Re: connection time out

2015-11-29 Thread Yuheng Du
Also, I can see the topic "speedx2" being created on the broker, but no
message data is coming through.

On Sun, Nov 29, 2015 at 7:00 PM, Yuheng Du  wrote:

> Hi guys,
>
> I am running a single-node broker in one cluster. When I run the
> producer from another cluster, I get a connection timeout error.
>
> I can ping into port 9092 and other ports on the broker machine from the
> producer. I just can't publish any messages. The command I used to run the
> producer is:
>
> bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> speedx2 50 100 -1 acks=1 bootstrap.servers=130.127.xxx.xxx:9092
> buffer.memory=67184 batch.size=8196
>
> Can anyone suggest what the problem might be?
>
>
> Thank you!
>
>
> best,
>
> Yuheng
>


connection time out

2015-11-29 Thread Yuheng Du
Hi guys,

I am running a single-node broker in one cluster. When I run the
producer from another cluster, I get a connection timeout error.

I can ping into port 9092 and other ports on the broker machine from the
producer. I just can't publish any messages. The command I used to run the
producer is:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
speedx2 50 100 -1 acks=1 bootstrap.servers=130.127.xxx.xxx:9092
buffer.memory=67184 batch.size=8196

Can anyone suggest what the problem might be?


Thank you!


best,

Yuheng
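
An ICMP ping does not address a port, so a successful ping does not show that
a TCP connection to 9092 can be opened. A minimal sketch of a TCP-level check,
using only the Java standard library, is given here. The class name PortCheck,
the method canConnect and the 3000 ms timeout are illustrative assumptions, not
part of Kafka. Note that a successful connect only shows that the bootstrap
port is reachable; the producer afterwards connects to whatever address the
broker advertises in its metadata, which may differ.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port completes within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // connect() throws on refusal, timeout, or an unresolvable host name.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults; pass the broker host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9092;
        System.out.println(host + ":" + port + " reachable over TCP: "
                + canConnect(host, port, 3000));
    }
}
```

Run from the producer machine, this separates "port unreachable" from problems
that occur after the initial connection.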


Re: producer api

2015-09-14 Thread Yuheng Du
Thank you Erik. But in my setup, only one node in my broker cluster has a
public IP, so I can only use one bootstrap broker for now.


On Mon, Sep 14, 2015 at 3:50 PM, Helleren, Erik 
wrote:

> You only need one of the brokers to connect for publishing.  Kafka will
> tell the client about all the other brokers.  But best practice is to
> include all of them.
> -Erik
>
> On 9/14/15, 2:46 PM, "Yuheng Du"  wrote:
>
> >I am writing a kafka producer application in java. I want the producer to
> >publish data to a cluster of 6 brokers. Is there a way to specify only the
> >load balancing node but not all the brokers list?
> >
> >For example, like in the Kafka benchmarking commands:
> >
> >bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> >test 5000 100 -1 acks=-1 bootstrap.servers=
> >esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
> batch.size=64000
> >
> >it only specifies the bootstrap server node but not all the broker list
> >like:
> >
> >Properties props = new Properties();
> >
> >props.put("metadata.broker.list", "broker1:9092,broker2:9092");
> >
> >
> >
> >Thanks for replying.
>
>


producer api

2015-09-14 Thread Yuheng Du
I am writing a Kafka producer application in Java. I want the producer to
publish data to a cluster of 6 brokers. Is there a way to specify only a
load-balancing node instead of the full broker list?

For example, like in the Kafka benchmarking commands:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
test 5000 100 -1 acks=-1 bootstrap.servers=
esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=64000

It specifies only the bootstrap server node, not the full broker list as
in:

Properties props = new Properties();

props.put("metadata.broker.list", "broker1:9092,broker2:9092");



Thanks for replying.
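
For reference, "metadata.broker.list" is the setting of the older Scala
producer, while the newer Java producer (the one ProducerPerformance drives in
the command above) reads "bootstrap.servers". A minimal sketch of the
equivalent Properties for the newer producer follows. The class and method
names are illustrative assumptions; the values are copied from the benchmark
command, and the serializer class is one of the serializers shipped with the
Kafka clients library.

```java
import java.util.Properties;

public class ProducerConfigSketch {
    /** Builds settings for the newer Java producer from a single bootstrap address. */
    public static Properties singleBootstrap(String hostPort) {
        Properties props = new Properties();
        // One reachable broker is enough; the client discovers the rest from metadata.
        props.put("bootstrap.servers", hostPort);
        props.put("acks", "-1");
        props.put("buffer.memory", "67108864");
        props.put("batch.size", "64000");
        // The Java producer requires serializers to be named explicitly.
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        return props;
    }

    public static void main(String[] args) {
        // "broker1:9092" is a placeholder address.
        singleBootstrap("broker1:9092").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The returned Properties object would be passed to the producer constructor in
an application that has the Kafka clients library on its classpath.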


Re: latency test

2015-09-09 Thread Yuheng Du
Thank you Erik.

In my test I am using fixed 200-byte messages and I send 500k messages per
producer from 92 physically isolated producers. Each test run takes about 20
minutes. As the broker cluster is being migrated to a new physical cluster, I
will run my test and get the latency results in the next couple of
weeks.

I will keep you posted.

Thanks.

On Wed, Sep 9, 2015 at 4:58 PM, Helleren, Erik 
wrote:

> Yes, and that can really hurt average performance.  All the partitions
> were nearly identical up to the 99%’ile, and had very good performance at
> that level hovering around a few milli’s.  But when looking beyond the
> 99%’ile, there was that clear fork in the distribution where a set of 3
> partitions surged upwards.  This could be for a dozen different reasons:
> Network blips, noisy networks, location in the network, resource
> contention on that broker, etc.  But it affected that one broker more than
> others.  And the reasons for my cluster displaying this behavior could be
> very different than the reason for any other cluster.
>
> It's worth noting that this was mostly a latency test over a stress test.
> There was a single kafka producer object, very small message sizes (100
> bytes), and it was only pushing through around 5MB/s worth of data. And
> the client was configured to minimize the amount of data that would be on
> the internal queue/buffer waiting to be sent.  The messages that were
> being sent were composed of 10 byte ascii ‘words’ selected randomly
> from a dictionary of 1000 words, which benefits compression while still
> resulting in likely unique messages.  And the test I ran was only for 6
> min, and I did not do the work required to see if there was a burst of
> slower messages which caused this behavior, or if it was a consistent
> issue with that node.
> -Erik
>
>
> On 9/9/15, 2:24 PM, "Yuheng Du"  wrote:
>
> >So are you suggesting that the long delays in the slowest 1% of messages
> >happen in the slower partitions that are further away? Thanks.
> >
> >On Wed, Sep 9, 2015 at 3:15 PM, Helleren, Erik
> >
> >wrote:
> >
> >> So, I did my own latency test on a cluster of 3 nodes, and there is a
> >> significant difference around the 99%’ile and higher for partitions when
> >> measuring the ack time when configured for a single ack.  The graph
> >> that I wish I could attach or post clearly shows that around 1/3 of the
> >> partitions significantly diverge from the other two.  So, at least in my
> >> case, one of my brokers is further than the others.
> >> -Erik
> >>
> >> On 9/4/15, 1:06 PM, "Yuheng Du"  wrote:
> >>
> >> >No problem. Thanks for your advice. I think it would be fun to
> >>explore. I
> >> >only know how to program in java though. Hope it will work.
> >> >
> >> >On Fri, Sep 4, 2015 at 2:03 PM, Helleren, Erik
> >> >
> >> >wrote:
> >> >
> >> >> I think the suggestion is to have partitions/brokers >=1, so 32
> >>should
> >> >>be
> >> >> enough.
> >> >>
> >> >> As for latency tests, there isn’t a lot of code to do a latency test.
> >> >>If
> >> >> you just want to measure ack time it's around 100 lines.  I will try
> >>to
> >> >> push out some good latency testing code to github, but my company is
> >> >> scared of open sourcing code… so it might be a while…
> >> >> -Erik
> >> >>
> >> >>
> >> >> On 9/4/15, 12:55 PM, "Yuheng Du"  wrote:
> >> >>
> >> >> >Thanks for your reply Erik. I am running some more tests according
> >>to
> >> >>your
> >> >> >suggestions now and I will share with my results here. Is it
> >>necessary
> >> >>to
> >> >> >use a fixed number of partitions (32 partitions maybe) for my test?
> >> >> >
> >> >> >I am testing 2, 4, 8, 16 and 32 brokers scenarios, all of them are
> >> >>running
> >> >> >on individual physical nodes. So I think using at least 32
> >>partitions
> >> >>will
> >> >> >make more sense? I have seen latencies increase as the number of
> >> >> >partitions
> >> >> >goes up in my experiments.
> >> >> >
> >> >> >To get the latency of each event data recorded, are you suggesting
> >> >>that I
> >> >> >rewrite my own test program (in Java perhaps) or I can just modify

Re: latency test

2015-09-09 Thread Yuheng Du
So are you suggesting that the long delays in the slowest 1% of messages
happen in the slower partitions that are further away? Thanks.

On Wed, Sep 9, 2015 at 3:15 PM, Helleren, Erik 
wrote:

> So, I did my own latency test on a cluster of 3 nodes, and there is a
> significant difference around the 99%’ile and higher for partitions when
> measuring the ack time when configured for a single ack.  The graph
> that I wish I could attach or post clearly shows that around 1/3 of the
> partitions significantly diverge from the other two.  So, at least in my
> case, one of my brokers is further than the others.
> -Erik
>
> On 9/4/15, 1:06 PM, "Yuheng Du"  wrote:
>
> >No problem. Thanks for your advice. I think it would be fun to explore. I
> >only know how to program in java though. Hope it will work.
> >
> >On Fri, Sep 4, 2015 at 2:03 PM, Helleren, Erik
> >
> >wrote:
> >
> >> I think the suggestion is to have partitions/brokers >=1, so 32 should
> >>be
> >> enough.
> >>
> >> As for latency tests, there isn’t a lot of code to do a latency test.
> >>If
> >> you just want to measure ack time it's around 100 lines.  I will try to
> >> push out some good latency testing code to github, but my company is
> >> scared of open sourcing code… so it might be a while…
> >> -Erik
> >>
> >>
> >> On 9/4/15, 12:55 PM, "Yuheng Du"  wrote:
> >>
> >> >Thanks for your reply Erik. I am running some more tests according to
> >>your
> >> >suggestions now and I will share with my results here. Is it necessary
> >>to
> >> >use a fixed number of partitions (32 partitions maybe) for my test?
> >> >
> >> >I am testing 2, 4, 8, 16 and 32 brokers scenarios, all of them are
> >>running
> >> >on individual physical nodes. So I think using at least 32 partitions
> >>will
> >> >make more sense? I have seen latencies increase as the number of
> >> >partitions
> >> >goes up in my experiments.
> >> >
> >> >To get the latency of each event data recorded, are you suggesting
> >>that I
> >> >rewrite my own test program (in Java perhaps) or I can just modify the
> >> >standard test program provided by kafka (
> >> >https://gist.github.com/jkreps/c7ddb4041ef62a900e6c )? I guess I need
> >>to
> >> >rebuild the source if I modify the standard java test program
> >> >ProducerPerformance provided in kafka, right? Now this standard program
> >> >only has average latencies and percentile latencies but no per event
> >> >latencies.
> >> >
> >> >Thanks.
> >> >
> >> >On Fri, Sep 4, 2015 at 1:42 PM, Helleren, Erik
> >> >
> >> >wrote:
> >> >
> >> >> That is an excellent question!  There are a bunch of ways to monitor
> >> >> jitter and see when that is happening.  Here are a few:
> >> >>
> >> >> - You could slice the histogram every few seconds, save it out with a
> >> >> timestamp, and then look at how they compare.  This would be mostly
> >> >> manual, or you can graph line charts of the percentiles over time in
> >> >>excel
> >> >> where each percentile would be a series.  If you are using HDR
> >> >>Histogram,
> >> >> you should look at how to use the Recorder class to do this coupled
> >> >>with a
> >> >> ScheduledExecutorService.
> >> >>
> >> >> - You can just save the starting timestamp of the event and the
> >>latency
> >> >>of
> >> >> each event.  If you put it into a CSV, you can just load it up into
> >> >>excel
> >> >> and graph as a XY chart.  That way you can see every point during the
> >> >> running of your program and you can see trends.  You want to be
> >>careful
> >> >> about this one, especially of writing to a file in the callback that
> >> >>Kafka
> >> >> provides.
> >> >>
> >> >> Also, I have noticed that most of the very slow observations are at
> >> >> startup.  But don’t trust me, trust the data and share your findings.
> >> >> Also, having a 99.9 percentile provides a pretty good standard for
> >> >>typical
> >> >> poor case performance.  Average is borderline useless, 50%’ile is a
> >> >>better
> >> >> typica

Re: latency test

2015-09-04 Thread Yuheng Du
No problem. Thanks for your advice. I think it would be fun to explore. I
only know how to program in java though. Hope it will work.

On Fri, Sep 4, 2015 at 2:03 PM, Helleren, Erik 
wrote:

> I think the suggestion is to have partitions/brokers >=1, so 32 should be
> enough.
>
> As for latency tests, there isn’t a lot of code to do a latency test.  If
> you just want to measure ack time it's around 100 lines.  I will try to
> push out some good latency testing code to github, but my company is
> scared of open sourcing code… so it might be a while…
> -Erik
>
>
> On 9/4/15, 12:55 PM, "Yuheng Du"  wrote:
>
> >Thanks for your reply Erik. I am running some more tests according to your
> >suggestions now and I will share with my results here. Is it necessary to
> >use a fixed number of partitions (32 partitions maybe) for my test?
> >
> >I am testing 2, 4, 8, 16 and 32 brokers scenarios, all of them are running
> >on individual physical nodes. So I think using at least 32 partitions will
> >make more sense? I have seen latencies increase as the number of
> >partitions
> >goes up in my experiments.
> >
> >To get the latency of each event data recorded, are you suggesting that I
> >rewrite my own test program (in Java perhaps) or I can just modify the
> >standard test program provided by kafka (
> >https://gist.github.com/jkreps/c7ddb4041ef62a900e6c )? I guess I need to
> >rebuild the source if I modify the standard java test program
> >ProducerPerformance provided in kafka, right? Now this standard program
> >only has average latencies and percentile latencies but no per event
> >latencies.
> >
> >Thanks.
> >
> >On Fri, Sep 4, 2015 at 1:42 PM, Helleren, Erik
> >
> >wrote:
> >
> >> That is an excellent question!  There are a bunch of ways to monitor
> >> jitter and see when that is happening.  Here are a few:
> >>
> >> - You could slice the histogram every few seconds, save it out with a
> >> timestamp, and then look at how they compare.  This would be mostly
> >> manual, or you can graph line charts of the percentiles over time in
> >>excel
> >> where each percentile would be a series.  If you are using HDR
> >>Histogram,
> >> you should look at how to use the Recorder class to do this coupled
> >>with a
> >> ScheduledExecutorService.
> >>
> >> - You can just save the starting timestamp of the event and the latency
> >>of
> >> each event.  If you put it into a CSV, you can just load it up into
> >>excel
> >> and graph as a XY chart.  That way you can see every point during the
> >> running of your program and you can see trends.  You want to be careful
> >> about this one, especially of writing to a file in the callback that
> >>Kafka
> >> provides.
> >>
> >> Also, I have noticed that most of the very slow observations are at
> >> startup.  But don’t trust me, trust the data and share your findings.
> >> Also, having a 99.9 percentile provides a pretty good standard for
> >>typical
> >> poor case performance.  Average is borderline useless, 50%’ile is a
> >>better
> >> typical case because that’s the number that says “half of events will be
> >> this slow or faster”, or for values that are high like 99.9%’ile, “0.1%
> >>of
> >> all events will be slower than this”.
> >> -Erik
> >>
> >> On 9/4/15, 12:05 PM, "Yuheng Du"  wrote:
> >>
> >> >Thank you Erik! That's helpful!
> >> >
> >> >But also I see jitters of the maximum latencies when running the
> >> >experiment.
> >> >
> >> >The average end to acknowledgement latency from producer to broker is
> >> >around 5ms when using 92 producers and 4 brokers, and the 99.9
> >>percentile
> >> >latency is 58ms, but the maximum latency goes up to 1359 ms. How to
> >>locate
> >> >the source of this jitter?
> >> >
> >> >Thanks.
> >> >
> >> >On Fri, Sep 4, 2015 at 10:54 AM, Helleren, Erik
> >> >
> >> >wrote:
> >> >
> >> >> Well… not to be contrarian, but latency depends much more on the
> >>latency
> >> >> between the producer and the broker that is the leader for the
> >>partition
> >> >> you are publishing to.  At least when your brokers are not saturated
> >> >>with
> >> >> messages, and acks are set to 1.  If acks are set to ALL, latency

Re: latency test

2015-09-04 Thread Yuheng Du
Thanks for your reply Erik. I am running some more tests according to your
suggestions now and I will share my results here. Is it necessary to
use a fixed number of partitions (32 partitions maybe) for my test?

I am testing scenarios with 2, 4, 8, 16 and 32 brokers, each broker running
on its own physical node. So I think using at least 32 partitions will
make more sense? I have seen latencies increase as the number of partitions
goes up in my experiments.

To record the latency of each event, are you suggesting that I
write my own test program (in Java perhaps), or can I just modify the
standard test program provided by Kafka (
https://gist.github.com/jkreps/c7ddb4041ef62a900e6c )? I guess I need to
rebuild the source if I modify the standard Java test program
ProducerPerformance provided in Kafka, right? Currently this standard program
only reports average and percentile latencies but no per-event
latencies.

Thanks.
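
A minimal sketch of per-event latency recording, along the lines of the CSV
suggestion quoted below, is given here. It keeps samples in memory and only
renders the CSV afterwards, so that no file I/O happens on the path that
records a sample. The class LatencyLog and its methods are illustrative
assumptions, not part of Kafka or ProducerPerformance; in a real test,
record() would be called from the producer's send callback with timestamps
taken by the test program. The percentile uses the simple nearest-rank
definition rather than HDR Histogram.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LatencyLog {
    // Each sample is {sendTimeMs, latencyMs}.
    private final List<long[]> samples = new ArrayList<>();

    /** Stores one observation; cheap enough to call from a completion callback. */
    public synchronized void record(long sendTimeMs, long latencyMs) {
        samples.add(new long[] {sendTimeMs, latencyMs});
    }

    /** Renders all samples as CSV, one event per row, for plotting as an XY chart. */
    public synchronized String toCsv() {
        StringBuilder sb = new StringBuilder("send_time_ms,latency_ms\n");
        for (long[] s : samples) {
            sb.append(s[0]).append(',').append(s[1]).append('\n');
        }
        return sb.toString();
    }

    /** Nearest-rank percentile of the recorded latencies, for p in (0, 100]. */
    public synchronized long percentile(double p) {
        if (samples.isEmpty()) {
            throw new IllegalStateException("no samples recorded");
        }
        List<Long> sorted = new ArrayList<>();
        for (long[] s : samples) {
            sorted.add(s[1]);
        }
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.size() / 100.0);
        return sorted.get(Math.max(rank, 1) - 1);
    }

    public static void main(String[] args) {
        LatencyLog log = new LatencyLog();
        long start = System.currentTimeMillis();
        // Synthetic latencies only; a real test records measured values.
        for (long i = 1; i <= 10; i++) {
            log.record(start + i, i);
        }
        System.out.print(log.toCsv());
        System.out.println("median latency (ms): " + log.percentile(50));
    }
}
```

Writing toCsv() to a file once the run has finished keeps disk writes out of
the measurement itself.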

On Fri, Sep 4, 2015 at 1:42 PM, Helleren, Erik 
wrote:

> That is an excellent question!  There are a bunch of ways to monitor
> jitter and see when that is happening.  Here are a few:
>
> - You could slice the histogram every few seconds, save it out with a
> timestamp, and then look at how they compare.  This would be mostly
> manual, or you can graph line charts of the percentiles over time in excel
> where each percentile would be a series.  If you are using HDR Histogram,
> you should look at how to use the Recorder class to do this coupled with a
> ScheduledExecutorService.
>
> - You can just save the starting timestamp of the event and the latency of
> each event.  If you put it into a CSV, you can just load it up into excel
> and graph as a XY chart.  That way you can see every point during the
> running of your program and you can see trends.  You want to be careful
> about this one, especially about writing to a file in the callback that Kafka
> provides.
>
> Also, I have noticed that most of the very slow observations are at
> startup.  But don’t trust me, trust the data and share your findings.
> Also, having a 99.9 percentile provides a pretty good standard for typical
> poor case performance.  Average is borderline useless, 50%’ile is a better
> typical case because that’s the number that says “half of events will be
> this slow or faster”, or for values that are high like 99.9%’ile, “0.1% of
> all events will be slower than this”.
> -Erik
>
> On 9/4/15, 12:05 PM, "Yuheng Du"  wrote:
>
> >Thank you Erik! That's helpful!
> >
> >But also I see jitters of the maximum latencies when running the
> >experiment.
> >
> >The average end to acknowledgement latency from producer to broker is
> >around 5ms when using 92 producers and 4 brokers, and the 99.9 percentile
> >latency is 58ms, but the maximum latency goes up to 1359 ms. How to locate
> >the source of this jitter?
> >
> >Thanks.
> >
> >On Fri, Sep 4, 2015 at 10:54 AM, Helleren, Erik
> >
> >wrote:
> >
> >> Well… not to be contrarian, but latency depends much more on the latency
> >> between the producer and the broker that is the leader for the partition
> >> you are publishing to.  At least when your brokers are not saturated
> >>with
> >> messages, and acks are set to 1.  If acks are set to ALL, latency on a
> >> non-saturated kafka cluster will be: Round Trip Latency from producer to
> >> leader for partition + Max( slowest Round Trip Latency to a replica of
> >> that partition).  If a cluster is saturated with messages, we have to
> >> assume that all partitions receive an equal distribution of messages to
> >> avoid linear algebra and queueing theory models.  I don’t like linear
> >> algebra :P
> >>
> >> Since you are probably putting all your latencies into a single
> >>histogram
> >> per producer, or worse, just an average, this pattern would have been
> >> obscured.  Obligatory lecture about measuring latency by Gil Tene
> >> (https://www.youtube.com/watch?v=9MKY4KypBzg).  To verify this
> >>hypothesis,
> >> you should re-write the benchmark to plot the latencies for each write
> >>to
> >> a partition for each producer into a histogram. (HDR Histogram is pretty
> >> good for that).  This would give you producers*partitions histograms,
> >> which might be unwieldy for that many producers. But wait, there is
> >>hope!
> >>
> >> To verify that this hypothesis holds, you just have to see that there
> >>is a
> >> significant difference between different partitions on a SINGLE
> >>producing
> >> client. So, pick one producing client at random an

Re: latency test

2015-09-04 Thread Yuheng Du
Thank you Erik! That's helpful!

But I also see jitter in the maximum latencies when running the
experiment.

The average end-to-acknowledgement latency from producer to broker is
around 5ms when using 92 producers and 4 brokers, and the 99.9th percentile
latency is 58ms, but the maximum latency goes up to 1359 ms. How can I locate
the source of this jitter?

Thanks.

On Fri, Sep 4, 2015 at 10:54 AM, Helleren, Erik 
wrote:

> Well… not to be contrarian, but latency depends much more on the latency
> between the producer and the broker that is the leader for the partition
> you are publishing to.  At least when your brokers are not saturated with
> messages, and acks are set to 1.  If acks are set to ALL, latency on a
> non-saturated kafka cluster will be: Round Trip Latency from producer to
> leader for partition + Max( slowest Round Trip Latency to a replica of
> that partition).  If a cluster is saturated with messages, we have to
> assume that all partitions receive an equal distribution of messages to
> avoid linear algebra and queueing theory models.  I don’t like linear
> algebra :P
>
> Since you are probably putting all your latencies into a single histogram
> per producer, or worse, just an average, this pattern would have been
> obscured.  Obligatory lecture about measuring latency by Gil Tene
> (https://www.youtube.com/watch?v=9MKY4KypBzg).  To verify this hypothesis,
> you should re-write the benchmark to plot the latencies for each write to
> a partition for each producer into a histogram. (HDR Histogram is pretty
> good for that).  This would give you producers*partitions histograms,
> which might be unwieldy for that many producers. But wait, there is hope!
>
> To verify that this hypothesis holds, you just have to see that there is a
> significant difference between different partitions on a SINGLE producing
> client. So, pick one producing client at random and use the data from
> that. The easy way to do that is just plot all the partition latency
> histograms on top of each other in the same plot, that way you have a
> pretty plot to show people.  If you don’t want to set up plotting, you can
> just compare the medians (50th percentile) of the partitions’ histograms.
> If there is a lot of variance, your latency anomaly is explained by
> brokers 4-7 being slower than nodes 0-3!  If there isn’t a lot of variance
> at 50%, look at higher percentiles.  And if higher percentiles for all the
> partitions look the same, this hypothesis is disproved.
>
> If you want to make a general statement about latency of writing to kafka,
> you can merge all the histograms into a single histogram and plot that.
>
> To Yuheng’s credit, more brokers always results in more throughput. But
> throughput and latency are two different creatures.  It’s worth noting that
> kafka is designed to be high throughput first and low latency second.  And
> it does a really good job at both.
>
> Disclaimer: I might not like linear algebra, but I do like statistics.
> Let me know if there are topics that need more explanation above that
> aren’t covered by Gil’s lecture.
> -Erik
>
> On 9/4/15, 9:03 AM, "Yuheng Du"  wrote:
>
> >When I use 32 partitions, the 4-broker latency becomes larger than the
> >8-broker latency.
> >
> >So is it always true that using more brokers can give less latency when
> >the
> >number of partitions is at least the size of the brokers?
> >
> >Thanks.
> >
> >On Thu, Sep 3, 2015 at 10:45 PM, Yuheng Du 
> >wrote:
> >
> >> I am running a producer latency test. When using 92 producers in 92
> >> physical node publishing to 4 brokers, the latency is slightly lower
> >>than
> >> using 8 brokers, I am using 8 partitions for the topic.
> >>
> >> I have rerun the test and it gives me the same result, the 4 brokers
> >> scenario still has lower latency than the 8 brokers scenarios.
> >>
> >> It is weird because I tested 1broker, 2 brokers, 4 brokers, 8 brokers,
> >>16
> >> brokers and 32 brokers. For the rest of the case the latency decreases
> >>as
> >> the number of brokers increase.
> >>
> >> 4 brokers/8 brokers is the only pair that doesn't satisfy this rule.
> >>What
> >> could be the cause?
> >>
> >> I am using a 200 bytes message, the test let each producer publishes
> >>500k
> >> messages to a given topic. Every test run when I change the number of
> >> brokers, I use a new topic.
> >>
> >> Thanks for any advices.
> >>
>
>


Re: When a message is exposed to the consumer

2015-09-04 Thread Yuheng Du
Can't read it. Sorry

On Fri, Sep 4, 2015 at 12:08 PM, Roman Shramkov 
wrote:

> Её ай н Анны уйг
>
> sent from a mobile device, please excuse brevity and typos
>
>
> ---- User Yuheng Du wrote ----
>
> According to section 3.1 of the paper "Kafka: a Distributed Messaging
> System for Log Processing":
>
> "a message is only exposed to the consumers after it is flushed"
>
> Is this still true in current Kafka? That is, does a message only become
> available after it is flushed to disk?
>
> Thanks.
>


Re: latency test

2015-09-04 Thread Yuheng Du
When I use 32 partitions, the 4-broker latency becomes larger than the
8-broker latency.

So is it always true that using more brokers gives lower latency when the
number of partitions is at least the number of brokers?

Thanks.

On Thu, Sep 3, 2015 at 10:45 PM, Yuheng Du  wrote:

> I am running a producer latency test. When using 92 producers on 92
> physical nodes publishing to 4 brokers, the latency is slightly lower than
> when using 8 brokers. I am using 8 partitions for the topic.
>
> I have rerun the test and it gives me the same result: the 4-broker
> scenario still has lower latency than the 8-broker scenario.
>
> It is weird because I tested 1 broker, 2 brokers, 4 brokers, 8 brokers, 16
> brokers and 32 brokers. For the rest of the cases the latency decreases as
> the number of brokers increases.
>
> The 4-broker/8-broker pair is the only one that doesn't follow this rule.
> What could be the cause?
>
> I am using 200-byte messages, and the test lets each producer publish 500k
> messages to a given topic. For every test run where I change the number of
> brokers, I use a new topic.
>
> Thanks for any advice.
>


When a message is exposed to the consumer

2015-09-04 Thread Yuheng Du
According to section 3.1 of the paper "Kafka: a Distributed Messaging
System for Log Processing":

"a message is only exposed to the consumers after it is flushed"

Is this still true in current Kafka? That is, does a message only become
available after it is flushed to disk?

Thanks.


latency test

2015-09-03 Thread Yuheng Du
I am running a producer latency test. When using 92 producers on 92
physical nodes publishing to 4 brokers, the latency is slightly lower than
when using 8 brokers. I am using 8 partitions for the topic.

I have rerun the test and it gives me the same result: the 4-broker
scenario still has lower latency than the 8-broker scenario.

It is weird because I tested 1 broker, 2 brokers, 4 brokers, 8 brokers, 16
brokers and 32 brokers. For the rest of the cases the latency decreases as
the number of brokers increases.

The 4-broker/8-broker pair is the only one that doesn't follow this rule.
What could be the cause?

I am using 200-byte messages, and the test lets each producer publish 500k
messages to a given topic. For every test run where I change the number of
brokers, I use a new topic.

Thanks for any advice.


Re: Reduce latency

2015-08-18 Thread Yuheng Du
I see. So the internal queue overrides the producer buffer size
configuration? When the buffer is full, the producer will block on send, right?

On Tue, Aug 18, 2015 at 3:52 PM, Tao Feng  wrote:

> From what I understand, if you set the throughput to -1,
> ProducerPerformance will push records as fast as possible to an internal
> per-topic, per-partition queue. In the background there is a sender IO
> thread handling the actual record sending process. If you push records to
> the queue faster than the send rate, your queue will become longer and
> longer, and eventually record latency will become meaningless for a
> latency-focused test.
>
>
> On Tue, Aug 18, 2015 at 11:48 AM, Yuheng Du 
> wrote:
>
> > I see. Thank you Tao. But now I don't get it what Jay said that my
> latency
> > test only makes sense if I set a fixed throughput. Why do I need to set a
> > fixed throughput for my test instead of just set the expected throughput
> to
> > be -1 (as much as possible)?
> >
> > Thanks.
> >
> > On Tue, Aug 18, 2015 at 2:43 PM, Tao Feng  wrote:
> >
> > > Hi Yuheng,
> > >
> > > The 1 record/s is just a ProducerPerformance parameter for your
> > > producer's target throughput. Throttling only takes effect if you
> > > try to send more than 1 record/s.  The actual throughput of the test
> > > depends on your producer config and your setup.
> > >
> > > -Tao
> > >
> > > On Tue, Aug 18, 2015 at 11:34 AM, Yuheng Du 
> > > wrote:
> > >
> > > > Also, When I set the target throughput to be 1 records/s, The
> > actual
> > > > test results show I got an average of 579.86 records per second among
> > all
> > > > my producers. How did that happen? Why this number is not 1 then?
> > > > Thanks.
> > > >
> > > > On Tue, Aug 18, 2015 at 10:03 AM, Yuheng Du <
> yuheng.du.h...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thank you Jay, that really helps!
> > > > >
> > > > > Kishore, Where you can monitor whether the network is busy on IO in
> > > > visual
> > > > > vm? Thanks. I am running 90 producer process on 90 physical
> machines
> > in
> > > > the
> > > > > experiment.
> > > > >
> > > > > On Tue, Aug 18, 2015 at 1:19 AM, Jay Kreps 
> wrote:
> > > > >
> > > > >> Yuheng,
> > > > >>
> > > > >> From the command you gave it looks like you are configuring the
> perf
> > > > test
> > > > >> to send data as fast as possible (the -1 for target throughput).
> > This
> > > > >> means
> > > > >> it will always queue up a bunch of unsent data until the buffer is
> > > > >> exhausted and then block. The larger the buffer, the bigger the
> > queue.
> > > > >> This
> > > > >> is where the latency comes from. This is exactly what you would
> > expect
> > > > and
> > > > >> what the buffering is supposed to do.
> > > > >>
> > > > >> If you want to measure latency this test doesn't really make
> sense,
> > > you
> > > > >> need to measure with some fixed throughput. Instead of -1 enter
> the
> > > > target
> > > > >> throughput you want to measure latency at (e.g. 10
> records/sec).
> > > > >>
> > > > >> -Jay
> > > > >>
> > > > >> On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du <
> > yuheng.du.h...@gmail.com
> > > >
> > > > >> wrote:
> > > > >>
> > > > >> > Thank you Alvaro,
> > > > >> >
> > > > >> > How to use sync producers? I am running the standard
> > > > ProducerPerformance
> > > > >> > test from kafka to measure the latency of each message to send
> > from
> > > > >> > producer to broker only.
> > > > >> > The command is like "bin/kafka-run-class.sh
> > > > >> > org.apache.kafka.clients.tools.ProducerPerformance test7
> 5000
> > > 100
> > > > -1
> > > > >> > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> > > > >> > buffer.memory=67108864 batch.size=8196"
> > > > >> >
> > > > >> > For running producers

Re: Reduce latency

2015-08-18 Thread Yuheng Du
I see. Thank you Tao. But now I don't understand what Jay meant when he said
my latency test only makes sense if I set a fixed throughput. Why do I need
to set a fixed throughput for my test instead of just setting the target
throughput to -1 (as much as possible)?

Thanks.
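
The effect Jay describes in the quoted reply below can be illustrated with a
toy queue model. This is a sketch under simplifying assumptions, not Kafka's
actual buffering code: time advances in 1 ms steps, the producer offers a
fixed number of records per step, the sender drains a fixed number per step,
and a full buffer blocks the producer. The class name, method name and all
rates are hypothetical.

```java
public class QueueLatencySketch {
    /**
     * Simulates a bounded buffer and returns the queueing delay in milliseconds
     * implied by the backlog left after the given number of 1 ms steps.
     */
    public static double steadyStateDelayMs(int offeredPerMs, int drainPerMs,
                                            int capacity, int steps) {
        long queued = 0;
        for (int t = 0; t < steps; t++) {
            // A full buffer blocks the producer, so the backlog is capped at capacity.
            queued = Math.min(capacity, queued + offeredPerMs);
            // The sender thread removes up to drainPerMs records per step.
            queued = Math.max(0, queued - drainPerMs);
        }
        return (double) queued / drainPerMs;
    }

    public static void main(String[] args) {
        // Unthrottled: the producer offers more than the sender can drain.
        System.out.println("small buffer : " + steadyStateDelayMs(20, 10, 1000, 5000) + " ms");
        System.out.println("large buffer : " + steadyStateDelayMs(20, 10, 10000, 5000) + " ms");
        // Throttled below the drain rate: no backlog builds up.
        System.out.println("throttled    : " + steadyStateDelayMs(5, 10, 10000, 5000) + " ms");
    }
}
```

In the unthrottled cases the delay is set by the buffer size rather than by
the network or the brokers, which is why a latency measurement needs a target
throughput below what the system can sustain.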

On Tue, Aug 18, 2015 at 2:43 PM, Tao Feng  wrote:

> Hi Yuheng,
>
> The 1 record/s is just a ProducerPerformance parameter for your
> producer's target throughput. Throttling only takes effect if you
> try to send more than 1 record/s.  The actual throughput of the test
> depends on your producer config and your setup.
>
> -Tao
>
> On Tue, Aug 18, 2015 at 11:34 AM, Yuheng Du 
> wrote:
>
> > Also, When I set the target throughput to be 1 records/s, The actual
> > test results show I got an average of 579.86 records per second among all
> > my producers. How did that happen? Why this number is not 1 then?
> > Thanks.
> >
> > On Tue, Aug 18, 2015 at 10:03 AM, Yuheng Du 
> > wrote:
> >
> > > Thank you Jay, that really helps!
> > >
> > > Kishore, Where you can monitor whether the network is busy on IO in
> > visual
> > > vm? Thanks. I am running 90 producer process on 90 physical machines in
> > the
> > > experiment.
> > >
> > > On Tue, Aug 18, 2015 at 1:19 AM, Jay Kreps  wrote:
> > >
> > >> Yuheng,
> > >>
> > >> From the command you gave it looks like you are configuring the perf
> > test
> > >> to send data as fast as possible (the -1 for target throughput). This
> > >> means
> > >> it will always queue up a bunch of unsent data until the buffer is
> > >> exhausted and then block. The larger the buffer, the bigger the queue.
> > >> This
> > >> is where the latency comes from. This is exactly what you would expect
> > and
> > >> what the buffering is supposed to do.
> > >>
> > >> If you want to measure latency this test doesn't really make sense,
> you
> > >> need to measure with some fixed throughput. Instead of -1 enter the
> > target
> > >> throughput you want to measure latency at (e.g. 10 records/sec).
> > >>
> > >> -Jay
> > >>
> > >> On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du  >
> > >> wrote:
> > >>
> > >> > Thank you Alvaro,
> > >> >
> > >> > How to use sync producers? I am running the standard
> > ProducerPerformance
> > >> > test from kafka to measure the latency of each message to send from
> > >> > producer to broker only.
> > >> > The command is like "bin/kafka-run-class.sh
> > >> > org.apache.kafka.clients.tools.ProducerPerformance test7 5000
> 100
> > -1
> > >> > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> > >> > buffer.memory=67108864 batch.size=8196"
> > >> >
> > >> > For running producers, where should I put the producer.type=sync
> > >> > configuration into? The config/server.properties? Also Does this
> mean
> > we
> > >> > are using batch size of 1? Which version of Kafka are you using?
> > >> > thanks.
> > >> >
> > >> > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe  >
> > >> > wrote:
> > >> >
> > >> > > Are you measuring latency as time between producer and consumer ?
> > >> > >
> > >> > > In that case, the ack shouldn't affect the latency, cause even
> tough
> > >> your
> > >> > > producer is not going to wait for the ack, the consumer will only
> > get
> > >> the
> > >> > > message after its commited in the server.
> > >> > >
> > >> > > About latency my best result occur with sync producers, but the
> > >> > throughput
> > >> > > is much lower in that case.
> > >> > >
> > >> > > About not flushing to disk I'm pretty sure that it's not an option
> > in
> > >> > kafka
> > >> > > (correct me if I'm wrong)
> > >> > >
> > >> > > Regards,
> > >> > > Alvaro Gareppe
> > >> > >
> > >> > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du <
> > yuheng.du.h...@gmail.com
> > >> >
> > >> > > wrote:
> > >> > >
> > >> > > > Also, the latency results show no major difference when using
> > ack=0
> > >> or
> > >> > > > ack=1. Why is that?
> > >> > > >
> > >> > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du <
> > >> yuheng.du.h...@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > I am running an experiment where 92 producers is publishing
> data
> > >> > into 6
> > >> > > > > brokers and 10 consumer are reading online data
> simultaneously.
> > >> > > > >
> > >> > > > > How should I do to reduce the latency? Currently when I run
> the
> > >> > > producer
> > >> > > > > performance test the average latency is around 10s.
> > >> > > > >
> > >> > > > > Should I disable log.flush? How to do that? Thanks.
> > >> > > > >
> > >> > > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Ing. Alvaro Gareppe
> > >> > > agare...@gmail.com
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
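
The throttling behaviour Tao and Jay describe in the quotes above can be
sketched as a small rate cap. This is only an illustration under stated
assumptions: throttle_delay is a hypothetical helper, not the actual
ProducerPerformance code, and a target of -1 is taken to mean "no cap".

```python
def throttle_delay(records_sent, elapsed_s, target_per_s):
    """Seconds to sleep so the average send rate stays at or below the target.

    Hypothetical helper for illustration only. A non-positive target
    (such as -1) is treated as "send as fast as possible".
    """
    if target_per_s <= 0:
        return 0.0  # no cap: never sleep, so unsent data piles up in the buffer
    # Earliest time at which records_sent records are allowed to have been sent.
    earliest_allowed_s = records_sent / target_per_s
    return max(0.0, earliest_allowed_s - elapsed_s)


# A sender that is already slower than the cap is never delayed,
# so the target acts as an upper bound, not as a guarantee.
print(throttle_delay(10, 2.0, 100))
```

With a fixed target the producer sleeps instead of filling the buffer, so
the measured latency reflects the send path rather than the time spent
waiting in the queue.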


Re: Reduce latency

2015-08-18 Thread Yuheng Du
Also, when I set the target throughput to 1 record/s, the actual test
results show an average of 579.86 records per second across all my
producers. How did that happen? Why is this number not 1?
Thanks.

On Tue, Aug 18, 2015 at 10:03 AM, Yuheng Du 
wrote:

> Thank you Jay, that really helps!
>
> Kishore, Where you can monitor whether the network is busy on IO in visual
> vm? Thanks. I am running 90 producer process on 90 physical machines in the
> experiment.
>
> On Tue, Aug 18, 2015 at 1:19 AM, Jay Kreps  wrote:
>
>> Yuheng,
>>
>> From the command you gave it looks like you are configuring the perf test
>> to send data as fast as possible (the -1 for target throughput). This
>> means
>> it will always queue up a bunch of unsent data until the buffer is
>> exhausted and then block. The larger the buffer, the bigger the queue.
>> This
>> is where the latency comes from. This is exactly what you would expect and
>> what the buffering is supposed to do.
>>
>> If you want to measure latency this test doesn't really make sense, you
>> need to measure with some fixed throughput. Instead of -1 enter the target
>> throughput you want to measure latency at (e.g. 10 records/sec).
>>
>> -Jay
>>
>> On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du 
>> wrote:
>>
>> > Thank you Alvaro,
>> >
>> > How to use sync producers? I am running the standard ProducerPerformance
>> > test from kafka to measure the latency of each message to send from
>> > producer to broker only.
>> > The command is like "bin/kafka-run-class.sh
>> > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1
>> > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
>> > buffer.memory=67108864 batch.size=8196"
>> >
>> > For running producers, where should I put the producer.type=sync
>> > configuration into? The config/server.properties? Also Does this mean we
>> > are using batch size of 1? Which version of Kafka are you using?
>> > thanks.
>> >
>> > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe 
>> > wrote:
>> >
>> > > Are you measuring latency as time between producer and consumer ?
>> > >
>> > > In that case, the ack shouldn't affect the latency, cause even tough
>> your
>> > > producer is not going to wait for the ack, the consumer will only get
>> the
>> > > message after its commited in the server.
>> > >
>> > > About latency my best result occur with sync producers, but the
>> > throughput
>> > > is much lower in that case.
>> > >
>> > > About not flushing to disk I'm pretty sure that it's not an option in
>> > kafka
>> > > (correct me if I'm wrong)
>> > >
>> > > Regards,
>> > > Alvaro Gareppe
>> > >
>> > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du > >
>> > > wrote:
>> > >
>> > > > Also, the latency results show no major difference when using ack=0
>> or
>> > > > ack=1. Why is that?
>> > > >
>> > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du <
>> yuheng.du.h...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > I am running an experiment where 92 producers is publishing data
>> > into 6
>> > > > > brokers and 10 consumer are reading online data simultaneously.
>> > > > >
>> > > > > How should I do to reduce the latency? Currently when I run the
>> > > producer
>> > > > > performance test the average latency is around 10s.
>> > > > >
>> > > > > Should I disable log.flush? How to do that? Thanks.
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Ing. Alvaro Gareppe
>> > > agare...@gmail.com
>> > >
>> >
>>
>
>


Re: Reduce latency

2015-08-18 Thread Yuheng Du
Thank you Jay, that really helps!

Kishore, where in VisualVM can I monitor whether the network is busy on
IO? Thanks. I am running 90 producer processes on 90 physical machines in
the experiment.

On Tue, Aug 18, 2015 at 1:19 AM, Jay Kreps  wrote:

> Yuheng,
>
> From the command you gave it looks like you are configuring the perf test
> to send data as fast as possible (the -1 for target throughput). This means
> it will always queue up a bunch of unsent data until the buffer is
> exhausted and then block. The larger the buffer, the bigger the queue. This
> is where the latency comes from. This is exactly what you would expect and
> what the buffering is supposed to do.
>
> If you want to measure latency this test doesn't really make sense, you
> need to measure with some fixed throughput. Instead of -1 enter the target
> throughput you want to measure latency at (e.g. 10 records/sec).
>
> -Jay
>
> On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du 
> wrote:
>
> > Thank you Alvaro,
> >
> > How to use sync producers? I am running the standard ProducerPerformance
> > test from kafka to measure the latency of each message to send from
> > producer to broker only.
> > The command is like "bin/kafka-run-class.sh
> > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1
> > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> > buffer.memory=67108864 batch.size=8196"
> >
> > For running producers, where should I put the producer.type=sync
> > configuration into? The config/server.properties? Also Does this mean we
> > are using batch size of 1? Which version of Kafka are you using?
> > thanks.
> >
> > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe 
> > wrote:
> >
> > > Are you measuring latency as time between producer and consumer ?
> > >
> > > In that case, the ack shouldn't affect the latency, cause even tough
> your
> > > producer is not going to wait for the ack, the consumer will only get
> the
> > > message after its commited in the server.
> > >
> > > About latency my best result occur with sync producers, but the
> > throughput
> > > is much lower in that case.
> > >
> > > About not flushing to disk I'm pretty sure that it's not an option in
> > kafka
> > > (correct me if I'm wrong)
> > >
> > > Regards,
> > > Alvaro Gareppe
> > >
> > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du 
> > > wrote:
> > >
> > > > Also, the latency results show no major difference when using ack=0
> or
> > > > ack=1. Why is that?
> > > >
> > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du <
> yuheng.du.h...@gmail.com>
> > > > wrote:
> > > >
> > > > > I am running an experiment where 92 producers is publishing data
> > into 6
> > > > > brokers and 10 consumer are reading online data simultaneously.
> > > > >
> > > > > How should I do to reduce the latency? Currently when I run the
> > > producer
> > > > > performance test the average latency is around 10s.
> > > > >
> > > > > Should I disable log.flush? How to do that? Thanks.
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Ing. Alvaro Gareppe
> > > agare...@gmail.com
> > >
> >
>


Re: Reduce latency

2015-08-17 Thread Yuheng Du
Thank you Kishore, I made the buffer twice the size of the batch size and
the latency dropped significantly.

But is there only one IO thread sending the batches? Can I increase the
number of threads sending the batches so that more than one batch can be
sent at the same time?

Thanks.



On Thu, Aug 13, 2015 at 5:38 PM, Kishore Senji  wrote:

> Your batch.size is 8196 and your buffer.memory is 67108864. This means
> 67108864/8196
> ~ 8188 batches are in memory ready to the sent. There is only one thread io
> thread sending them. I would guess that the io thread (
> kafka-producer-network-thread) would be busy. Please check it in visual vm.
>
> In steady state, every batch has to wait for the previous 8187 batches to
> be done before it gets a chance to be sent out, but the latency is counted
> from the time is added to the queue. This is the reason that you are seeing
> very high end-to-end latency.
>
> Have the buffer.memory to be only twice that of the batch.size so that
> while one is in flight, you can another batch ready to go (and the
> KafkaProducer would block to send more when there is no memory and this way
> the batches are not waiting in the queue unnecessarily) . Also may be you
> want to increase the batch.size further more, you will get even better
> throughput with more or less same latency (as there is no shortage of
> events in the test program).
>
> On Thu, Aug 13, 2015 at 1:13 PM Yuheng Du 
> wrote:
>
> > Yes there is. But if we are using ProducerPerformance test, it's
> configured
> > as giving input when running the test command. Do you write a java
> program
> > to test the latency? Thanks.
> >
> > On Thu, Aug 13, 2015 at 3:54 PM, Alvaro Gareppe 
> > wrote:
> >
> > > I'm using last one, but not using the ProducerPerformance, I created my
> > > own. but I think there is a producer.properties file in config folder
> in
> > > kafka.. is that configuration not for this tester ?
> > >
> > > On Thu, Aug 13, 2015 at 4:18 PM, Yuheng Du 
> > > wrote:
> > >
> > > > Thank you Alvaro,
> > > >
> > > > How to use sync producers? I am running the standard
> > ProducerPerformance
> > > > test from kafka to measure the latency of each message to send from
> > > > producer to broker only.
> > > > The command is like "bin/kafka-run-class.sh
> > > > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100
> > -1
> > > > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> > > > buffer.memory=67108864 batch.size=8196"
> > > >
> > > > For running producers, where should I put the producer.type=sync
> > > > configuration into? The config/server.properties? Also Does this mean
> > we
> > > > are using batch size of 1? Which version of Kafka are you using?
> > > > thanks.
> > > >
> > > > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe 
> > > > wrote:
> > > >
> > > > > Are you measuring latency as time between producer and consumer ?
> > > > >
> > > > > In that case, the ack shouldn't affect the latency, cause even
> tough
> > > your
> > > > > producer is not going to wait for the ack, the consumer will only
> get
> > > the
> > > > > message after its commited in the server.
> > > > >
> > > > > About latency my best result occur with sync producers, but the
> > > > throughput
> > > > > is much lower in that case.
> > > > >
> > > > > About not flushing to disk I'm pretty sure that it's not an option
> in
> > > > kafka
> > > > > (correct me if I'm wrong)
> > > > >
> > > > > Regards,
> > > > > Alvaro Gareppe
> > > > >
> > > > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du <
> > yuheng.du.h...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Also, the latency results show no major difference when using
> ack=0
> > > or
> > > > > > ack=1. Why is that?
> > > > > >
> > > > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du <
> > > yuheng.du.h...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I am running an experiment where 92 producers is publishing
> data
> > > > into 6
> > > > > > > brokers and 10 consumer are reading online data simultaneously.
> > > > > > >
> > > > > > > How should I do to reduce the latency? Currently when I run the
> > > > > producer
> > > > > > > performance test the average latency is around 10s.
> > > > > > >
> > > > > > > Should I disable log.flush? How to do that? Thanks.
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Ing. Alvaro Gareppe
> > > > > agare...@gmail.com
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Ing. Alvaro Gareppe
> > > agare...@gmail.com
> > >
> >
>
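
Kishore's arithmetic in the quote above can be reproduced with a short
sketch. The function names and the per-batch send time are assumptions
made up for illustration; only buffer.memory=67108864 and batch.size=8196
come from the thread.

```python
def batches_in_buffer(buffer_memory, batch_size):
    """Number of full batches that fit into the producer buffer."""
    return buffer_memory // batch_size


def queue_wait_s(buffer_memory, batch_size, send_time_s):
    """Rough wait of the newest batch behind all batches queued ahead of it.

    send_time_s is a hypothetical time for the single IO thread to send
    one batch; it is not a measured value.
    """
    ahead = max(0, batches_in_buffer(buffer_memory, batch_size) - 1)
    return ahead * send_time_s


print(batches_in_buffer(67108864, 8196))   # 8188
print(batches_in_buffer(2 * 8196, 8196))   # 2
```

In this simplified model, shrinking the buffer to two batches leaves at
most one batch ahead in the queue, which matches the advice to keep
buffer.memory at about twice batch.size for this test.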


Re: use page cache as much as possible

2015-08-14 Thread Yuheng Du
Thank you Kishore, I see that the end-to-end latency may not be reduced by
setting the flush interval manually.

But if the default flush.ms is Long.MAX_VALUE, why do I see the disk usage
of the brokers constantly increasing while the producer is pushing in data?
Shouldn't that happen only once in a while? The OS page cache usage should
not be reflected when using the "watch df -h" command, am I correct?

Thanks.

On Fri, Aug 14, 2015 at 10:12 PM, Kishore Senji  wrote:

> Actually in 0.8.2, flush.ms & flush.messages are recommended to be left
> defaults (Long.MAX_VALUE)
> http://kafka.apache.org/documentation.html (search for flush.ms)
>
> The disk flush and the committed offset are two independent things. As long
> as you have replication, the recommended thing is to leave the flushing to
> the OS. But if you choose to flush manually the time interval at which you
> flush may not influence the end-to-end latency from the Producer to
> Consumer, however it can influence the throughput of the broker.
>
> On Fri, Aug 14, 2015 at 9:20 AM Yuheng Du 
> wrote:
>
> > So if I understand correctly, even if I delay flushing, the consumer will
> > get the messages as soon as the broker receives them and put them into
> page
> > cache (assuming producer doesn't wait for acks from brokers)?
> >
> > And will the decrease of log.flush interval help reduce latency between
> > producer and consumer?
> >
> > Thanks.
> >
> >
> > On Fri, Aug 14, 2015 at 11:57 AM, Kishore Senji 
> wrote:
> >
> > > Thank you Gwen for correcting me. This document (
> > > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Replication)
> in
> > > "Writes" section also has specified the same thing as you have
> mentioned.
> > > One thing is not clear to me as to what happens when the Replicas add
> the
> > > message to memory but the leader fails before acking to the producer.
> > Later
> > > the leader replica is chosen to be the leader for the partition, it
> will
> > > advance the HW to its LEO (which has the message). The producer can
> > resend
> > > the same message thinking it failed and there will be a duplicate
> > message.
> > > Is my understanding correct here?
> > >
> > > On Thu, Aug 13, 2015 at 10:50 PM, Gwen Shapira 
> > wrote:
> > >
> > > > On Thu, Aug 13, 2015 at 4:10 PM, Kishore Senji 
> > wrote:
> > > >
> > > > > Consumers can only fetch data up to the committed offset and the
> > reason
> > > > is
> > > > > reliability and durability on a broker crash (some consumers might
> > get
> > > > the
> > > > > new data and some may not as the data is not yet committed and
> lost).
> > > > Data
> > > > > will be committed when it is flushed. So if you delay the flushing,
> > > > > consumers won't get those messages until that time.
> > > > >
> > > >
> > > > As far as I know, this is not accurate.
> > > >
> > > > A message is considered committed when all ISR replicas received it
> > (this
> > > > much is documented). This doesn't need to include writing to disk,
> > which
> > > > will happen asynchronously.
> > > >
> > > >
> > > > >
> > > > > Even though you flush periodically based on
> > log.flush.interval.messages
> > > > and
> > > > > log.flush.interval.ms, if the segment file is in the pagecache,
> the
> > > > > consumers will still benefit from that pagecache and OS wouldn't
> read
> > > it
> > > > > again from disk.
> > > > >
> > > > > On Thu, Aug 13, 2015 at 2:54 PM Yuheng Du <
> yuheng.du.h...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > As I understand it, kafka brokers will store the incoming
> messages
> > > into
> > > > > > pagecache as much as possible and then flush them into disk,
> right?
> > > > > >
> > > > > > But in my experiment where 90 producers is publishing data into 6
> > > > > brokers,
> > > > > > I see that the log directory on disk where broker stores the data
> > is
> > > > > > constantly increasing (every seconds.) So why this is happening?
> > Does
> > > > > this
> > > > > > has to do with the default "log.flush.interval" setting?
> > > > > >
> > > > > > I want the broker to write to disk less often when serving some
> > > on-line
> > > > > > consumers to reduce latency. I tested in my broker the disk write
> > > speed
> > > > > is
> > > > > > around 110MB/s.
> > > > > >
> > > > > > Thanks for any replies.
> > > > > >
> > > > >
> > > >
> > >
> >
>
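
Gwen's point, and the HW/LEO terms Kishore uses in the quotes above, can
be illustrated with a simplified model. It assumes the high watermark is
the smallest log end offset among the in-sync replicas and that consumers
may read offsets below it; the function names and the replica names are
made up for the example, and disk flushing plays no part in it.

```python
def high_watermark(isr_log_end_offsets):
    """Smallest log end offset (LEO) across the in-sync replicas."""
    return min(isr_log_end_offsets.values())


def fetchable(offset, isr_log_end_offsets):
    """A consumer may read an offset only once every ISR replica has it."""
    return offset < high_watermark(isr_log_end_offsets)


# Hypothetical partition state: the LEO is the next offset each replica will write.
leos = {"leader": 12, "follower-1": 10, "follower-2": 11}
print(high_watermark(leos))   # 10
print(fetchable(9, leos))     # True
print(fetchable(10, leos))    # False
```

Note that nothing in this model asks whether a replica has flushed to
disk, which is why delaying the flush does not by itself delay consumers.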


Re: use page cache as much as possible

2015-08-14 Thread Yuheng Du
So if I understand correctly, even if I delay flushing, the consumer will
get the messages as soon as the broker receives them and puts them into the
page cache (assuming the producer doesn't wait for acks from the brokers)?

And will decreasing the log.flush interval help reduce latency between
producer and consumer?

Thanks.


On Fri, Aug 14, 2015 at 11:57 AM, Kishore Senji  wrote:

> Thank you Gwen for correcting me. This document (
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Replication) in
> "Writes" section also has specified the same thing as you have mentioned.
> One thing is not clear to me as to what happens when the Replicas add the
> message to memory but the leader fails before acking to the producer. Later
> the leader replica is chosen to be the leader for the partition, it will
> advance the HW to its LEO (which has the message). The producer can resend
> the same message thinking it failed and there will be a duplicate message.
> Is my understanding correct here?
>
> On Thu, Aug 13, 2015 at 10:50 PM, Gwen Shapira  wrote:
>
> > On Thu, Aug 13, 2015 at 4:10 PM, Kishore Senji  wrote:
> >
> > > Consumers can only fetch data up to the committed offset and the reason
> > is
> > > reliability and durability on a broker crash (some consumers might get
> > the
> > > new data and some may not as the data is not yet committed and lost).
> > Data
> > > will be committed when it is flushed. So if you delay the flushing,
> > > consumers won't get those messages until that time.
> > >
> >
> > As far as I know, this is not accurate.
> >
> > A message is considered committed when all ISR replicas received it (this
> > much is documented). This doesn't need to include writing to disk, which
> > will happen asynchronously.
> >
> >
> > >
> > > Even though you flush periodically based on log.flush.interval.messages
> > and
> > > log.flush.interval.ms, if the segment file is in the pagecache, the
> > > consumers will still benefit from that pagecache and OS wouldn't read
> it
> > > again from disk.
> > >
> > > On Thu, Aug 13, 2015 at 2:54 PM Yuheng Du 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > As I understand it, kafka brokers will store the incoming messages
> into
> > > > pagecache as much as possible and then flush them into disk, right?
> > > >
> > > > But in my experiment where 90 producers is publishing data into 6
> > > brokers,
> > > > I see that the log directory on disk where broker stores the data is
> > > > constantly increasing (every seconds.) So why this is happening? Does
> > > this
> > > > has to do with the default "log.flush.interval" setting?
> > > >
> > > > I want the broker to write to disk less often when serving some
> on-line
> > > > consumers to reduce latency. I tested in my broker the disk write
> speed
> > > is
> > > > around 110MB/s.
> > > >
> > > > Thanks for any replies.
> > > >
> > >
> >
>


use page cache as much as possible

2015-08-13 Thread Yuheng Du
Hi,

As I understand it, Kafka brokers will keep the incoming messages in the
page cache as much as possible and then flush them to disk, right?

But in my experiment where 90 producers are publishing data to 6 brokers,
I see that the log directory on disk where the broker stores the data is
constantly growing (every second). So why is this happening? Does this
have to do with the default "log.flush.interval" setting?

I want the broker to write to disk less often when serving some online
consumers, to reduce latency. I tested the disk write speed on my broker
and it is around 110MB/s.

Thanks for any replies.
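
One possible reading of the growing log directory, not confirmed by
anyone in this thread: a file's reported size grows as soon as data is
written to it, even if no flush to disk has been requested yet. The
sketch below shows only that general file-system behaviour with a
temporary file; size_after_unsynced_write is a made-up helper and nothing
here is Kafka code.

```python
import os
import tempfile


def size_after_unsynced_write(payload):
    """Write payload without calling fsync and return the reported file size."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)            # handed to the OS; no fsync is issued
        return os.stat(path).st_size     # the size already includes the new bytes
    finally:
        os.close(fd)
        os.remove(path)


print(size_after_unsynced_write(b"x" * 4096))   # 4096
```

Whether and when the written bytes reach the physical disk is a separate
question that this sketch does not answer.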


Re: Reduce latency

2015-08-13 Thread Yuheng Du
Yes there is. But when using the ProducerPerformance test, the config is
passed as arguments on the test command line. Did you write a Java program
to test the latency? Thanks.

On Thu, Aug 13, 2015 at 3:54 PM, Alvaro Gareppe  wrote:

> I'm using last one, but not using the ProducerPerformance, I created my
> own. but I think there is a producer.properties file in config folder in
> kafka.. is that configuration not for this tester ?
>
> On Thu, Aug 13, 2015 at 4:18 PM, Yuheng Du 
> wrote:
>
> > Thank you Alvaro,
> >
> > How to use sync producers? I am running the standard ProducerPerformance
> > test from kafka to measure the latency of each message to send from
> > producer to broker only.
> > The command is like "bin/kafka-run-class.sh
> > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1
> > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> > buffer.memory=67108864 batch.size=8196"
> >
> > For running producers, where should I put the producer.type=sync
> > configuration into? The config/server.properties? Also Does this mean we
> > are using batch size of 1? Which version of Kafka are you using?
> > thanks.
> >
> > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe 
> > wrote:
> >
> > > Are you measuring latency as time between producer and consumer ?
> > >
> > > In that case, the ack shouldn't affect the latency, cause even tough
> your
> > > producer is not going to wait for the ack, the consumer will only get
> the
> > > message after its commited in the server.
> > >
> > > About latency my best result occur with sync producers, but the
> > throughput
> > > is much lower in that case.
> > >
> > > About not flushing to disk I'm pretty sure that it's not an option in
> > kafka
> > > (correct me if I'm wrong)
> > >
> > > Regards,
> > > Alvaro Gareppe
> > >
> > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du 
> > > wrote:
> > >
> > > > Also, the latency results show no major difference when using ack=0
> or
> > > > ack=1. Why is that?
> > > >
> > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du <
> yuheng.du.h...@gmail.com>
> > > > wrote:
> > > >
> > > > > I am running an experiment where 92 producers is publishing data
> > into 6
> > > > > brokers and 10 consumer are reading online data simultaneously.
> > > > >
> > > > > How should I do to reduce the latency? Currently when I run the
> > > producer
> > > > > performance test the average latency is around 10s.
> > > > >
> > > > > Should I disable log.flush? How to do that? Thanks.
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Ing. Alvaro Gareppe
> > > agare...@gmail.com
> > >
> >
>
>
>
> --
> Ing. Alvaro Gareppe
> agare...@gmail.com
>


Re: Reduce latency

2015-08-13 Thread Yuheng Du
Thank you Alvaro,

How do I use sync producers? I am running the standard ProducerPerformance
test from Kafka to measure the latency of each message sent from producer
to broker only.
The command is like "bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1
acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
buffer.memory=67108864 batch.size=8196"

When running producers, where should I put the producer.type=sync
configuration? In config/server.properties? Also, does this mean we
are using a batch size of 1? Which version of Kafka are you using?
Thanks.

On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe  wrote:

> Are you measuring latency as time between producer and consumer ?
>
> In that case, the ack shouldn't affect the latency, cause even tough your
> producer is not going to wait for the ack, the consumer will only get the
> message after its commited in the server.
>
> About latency my best result occur with sync producers, but the throughput
> is much lower in that case.
>
> About not flushing to disk I'm pretty sure that it's not an option in kafka
> (correct me if I'm wrong)
>
> Regards,
> Alvaro Gareppe
>
> On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du 
> wrote:
>
> > Also, the latency results show no major difference when using ack=0 or
> > ack=1. Why is that?
> >
> > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du 
> > wrote:
> >
> > > I am running an experiment where 92 producers is publishing data into 6
> > > brokers and 10 consumer are reading online data simultaneously.
> > >
> > > How should I do to reduce the latency? Currently when I run the
> producer
> > > performance test the average latency is around 10s.
> > >
> > > Should I disable log.flush? How to do that? Thanks.
> > >
> >
>
>
>
> --
> Ing. Alvaro Gareppe
> agare...@gmail.com
>


Reduce latency

2015-08-13 Thread Yuheng Du
I am running an experiment where 92 producers are publishing data to 6
brokers and 10 consumers are reading online data simultaneously.

What should I do to reduce the latency? Currently when I run the producer
performance test the average latency is around 10s.

Should I disable log.flush? How do I do that? Thanks.


Re: Reduce latency

2015-08-13 Thread Yuheng Du
Also, the latency results show no major difference between acks=0 and
acks=1. Why is that?

On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du 
wrote:

> I am running an experiment where 92 producers is publishing data into 6
> brokers and 10 consumer are reading online data simultaneously.
>
> How should I do to reduce the latency? Currently when I run the producer
> performance test the average latency is around 10s.
>
> Should I disable log.flush? How to do that? Thanks.
>


Variation of producer latency in ProducerPerformance test

2015-08-11 Thread Yuheng Du
Hi,

I am running a test in which 92 producers each publish 53000 records of
size 254 bytes to 2 brokers.

The average latency reported by each producer varies widely. For some
producers, the average latency is as low as 38ms to send the 53000 records;
but for others, the average latency is as high as 13067ms.

How can this be explained? To get the lowest latencies, what batch size
and other important configs should I use?

Thanks!


Kafka vs RabbitMQ latency

2015-08-04 Thread Yuheng Du
Hi guys,

I was reading a paper today in which the latency of kafka and rabbitmq is
compared:
http://downloads.hindawi.com/journals/js/2015/468047.pdf

To my surprise, Kafka showed large variations in latency as the
number of records per second increases.

So I am curious why that is. Also, in the
ProducerPerformance test: bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1
*acks=1* bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
buffer.memory=67108864 batch.size=8196

Setting acks=1 means the producer will wait for an ack from the leader
replica, right? Could that be what affects latency? If I set it to 0, will
the producers send as fast as possible, so that throughput increases and
latency decreases in the test results?

Thanks for answering.

best,


Re: multiple producer throughput

2015-07-27 Thread Yuheng Du
The message size is 100 bytes and each producer sends out 50 million
messages. These are the numbers used in the "benchmarking kafka" post:
http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

Thanks.

On Mon, Jul 27, 2015 at 4:15 PM, Prabhjot Bharaj 
wrote:

> Hi,
>
> Have you tried with acks=1 and -1 as well?
> Please share the numbers and the message size
>
> Regards,
> Prabcs
> On Jul 27, 2015 10:24 PM, "Yuheng Du"  wrote:
>
> > Hi,
> >
> > I am running 40 producers on 40 nodes cluster. The messages are sent to 6
> > brokers in another cluster. The producers are running ProducerPerformance
> > test.
> >
> > When 20 nodes are running, the throughput is around 13MB/s and when
> running
> > 40 nodes, the throughput is around 9MB/s.
> >
> > I have set log.retention.ms=9000 to delete the unwanted messages, just
> to
> > avoid the disk space to be filled.
> >
> > So I want to know how should I tune the system to get better throughput
> > result? Thanks.
> >
>


Re: deleting data automatically

2015-07-27 Thread Yuheng Du
Thank you!

On Mon, Jul 27, 2015 at 1:43 PM, Ewen Cheslack-Postava 
wrote:

> As I mentioned, adjusting any settings such that files are small enough
> that you don't get the benefits of append-only writes or file
> creation/deletion become a bottleneck might affect performance. It looks
> like the default setting for log.segment.bytes is 1GB, so given fast enough
> cleanup of old logs, you may not need to adjust that setting -- assuming
> you have a reasonable amount of storage, you'll easily fit many dozen log
> files of that size.
>
> -Ewen
>
> On Mon, Jul 27, 2015 at 10:36 AM, Yuheng Du 
> wrote:
>
> > Thank you! what performance impacts will it be if I change
> > log.segment.bytes? Thanks.
> >
> > On Mon, Jul 27, 2015 at 1:25 PM, Ewen Cheslack-Postava <
> e...@confluent.io>
> > wrote:
> >
> > > I think log.cleanup.interval.mins was removed in the first 0.8 release.
> > It
> > > sounds like you're looking at outdated docs. Search for
> > > log.retention.check.interval.ms here:
> > > http://kafka.apache.org/documentation.html
> > >
> > > As for setting the values too low hurting performance, I'd guess it's
> > > probably only an issue if you set them extremely small, such that file
> > > creation and cleanup become a bottleneck.
> > >
> > > -Ewen
> > >
> > > On Mon, Jul 27, 2015 at 10:03 AM, Yuheng Du 
> > > wrote:
> > >
> > > > If I want to get higher throughput, should I increase the
> > > > log.segment.bytes?
> > > >
> > > > I don't see log.retention.check.interval.ms, but there is
> > > > log.cleanup.interval.mins, is that what you mean?
> > > >
> > > > If I set log.roll.ms or log.cleanup.interval.mins too small, will it
> > > hurt
> > > > the throughput? Thanks.
> > > >
> > > > On Fri, Jul 24, 2015 at 11:03 PM, Ewen Cheslack-Postava <
> > > e...@confluent.io
> > > > >
> > > > wrote:
> > > >
> > > > > You'll want to set the log retention policy via
> > > > > log.retention.{ms,minutes,hours} or log.retention.bytes. If you
> want
> > > > really
> > > > > aggressive collection (e.g., on the order of seconds, as you
> > > specified),
> > > > > you might also need to adjust log.segment.bytes/log.roll.{ms,hours}
> > and
> > > > > log.retention.check.interval.ms.
> > > > >
> > > > > On Fri, Jul 24, 2015 at 12:49 PM, Yuheng Du <
> > yuheng.du.h...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I am testing the kafka producer performance. So I created a queue
> > and
> > > > > > writes a large amount of data to that queue.
> > > > > >
> > > > > > Is there a way to delete the data automatically after some time,
> > say
> > > > > > whenever the data size reaches 50GB or the retention time exceeds
> > 10
> > > > > > seconds, it will be deleted so my disk won't get filled and new
> > data
> > > > > can't
> > > > > > be written in?
> > > > > >
> > > > > > Thanks.!
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Thanks,
> > > > > Ewen
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Ewen
> > >
> >
>
>
>
> --
> Thanks,
> Ewen
>
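
The interaction between the retention time, the retention check interval, and the segment size discussed above can be sketched as a rough upper bound on how much data one partition keeps on disk. This is an approximation under stated assumptions, not Kafka's actual cleanup code: it assumes a constant write rate, that a closed segment is removed at the first check after its retention time has passed, and that the segment straddling the cutoff is kept whole. It ignores any extra delay before files are physically deleted. The function name and the example values are illustrative.

```python
def max_retained_bytes(write_rate_bytes_per_s, retention_ms,
                       check_interval_ms, segment_bytes):
    """Rough upper bound on bytes kept on disk for one partition."""
    # Data written within the last (retention + one check interval) is still
    # on disk, because cleanup only runs once per check interval.
    window_s = (retention_ms + check_interval_ms) / 1000.0
    recent_bytes = write_rate_bytes_per_s * window_s
    # Segments are deleted whole, so the oldest surviving segment can hold up
    # to one full segment of data from before that window.
    return recent_bytes + segment_bytes


if __name__ == "__main__":
    bound = max_retained_bytes(
        write_rate_bytes_per_s=13 * 1024 * 1024,  # hypothetical 13 MB/s
        retention_ms=9000,                        # log.retention.ms from the thread
        check_interval_ms=300000,                 # assumed 5-minute check interval
        segment_bytes=1024 ** 3,                  # 1GB log.segment.bytes
    )
    print(f"{bound / 1024 ** 3:.2f} GiB")
```

Under these assumptions the check interval, not the 9-second retention time, dominates the bound, which is consistent with the advice to adjust log.retention.check.interval.ms when aggressive collection is wanted.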


Re: deleting data automatically

2015-07-27 Thread Yuheng Du
Thank you! What performance impact will there be if I change
log.segment.bytes? Thanks.

On Mon, Jul 27, 2015 at 1:25 PM, Ewen Cheslack-Postava 
wrote:

> I think log.cleanup.interval.mins was removed in the first 0.8 release. It
> sounds like you're looking at outdated docs. Search for
> log.retention.check.interval.ms here:
> http://kafka.apache.org/documentation.html
>
> As for setting the values too low hurting performance, I'd guess it's
> probably only an issue if you set them extremely small, such that file
> creation and cleanup become a bottleneck.
>
> -Ewen
>
> On Mon, Jul 27, 2015 at 10:03 AM, Yuheng Du 
> wrote:
>
> > If I want to get higher throughput, should I increase the
> > log.segment.bytes?
> >
> > I don't see log.retention.check.interval.ms, but there is
> > log.cleanup.interval.mins, is that what you mean?
> >
> > If I set log.roll.ms or log.cleanup.interval.mins too small, will it
> hurt
> > the throughput? Thanks.
> >
> > On Fri, Jul 24, 2015 at 11:03 PM, Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > wrote:
> >
> > > You'll want to set the log retention policy via
> > > log.retention.{ms,minutes,hours} or log.retention.bytes. If you want
> > really
> > > aggressive collection (e.g., on the order of seconds, as you
> specified),
> > > you might also need to adjust log.segment.bytes/log.roll.{ms,hours} and
> > > log.retention.check.interval.ms.
> > >
> > > On Fri, Jul 24, 2015 at 12:49 PM, Yuheng Du 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I am testing the kafka producer performance. So I created a queue and
> > > > writes a large amount of data to that queue.
> > > >
> > > > Is there a way to delete the data automatically after some time, say
> > > > whenever the data size reaches 50GB or the retention time exceeds 10
> > > > seconds, it will be deleted so my disk won't get filled and new data
> > > can't
> > > > be written in?
> > > >
> > > > Thanks.!
> > > >
> > >
> > >
> > >
> > > --
> > > Thanks,
> > > Ewen
> > >
> >
>
>
>
> --
> Thanks,
> Ewen
>


Re: deleting data automatically

2015-07-27 Thread Yuheng Du
If I want to get higher throughput, should I increase
log.segment.bytes?

I don't see log.retention.check.interval.ms, but there is
log.cleanup.interval.mins. Is that what you mean?

If I set log.roll.ms or log.cleanup.interval.mins too small, will it hurt
throughput? Thanks.

On Fri, Jul 24, 2015 at 11:03 PM, Ewen Cheslack-Postava 
wrote:

> You'll want to set the log retention policy via
> log.retention.{ms,minutes,hours} or log.retention.bytes. If you want really
> aggressive collection (e.g., on the order of seconds, as you specified),
> you might also need to adjust log.segment.bytes/log.roll.{ms,hours} and
> log.retention.check.interval.ms.
>
> On Fri, Jul 24, 2015 at 12:49 PM, Yuheng Du 
> wrote:
>
> > Hi,
> >
> > I am testing the kafka producer performance. So I created a queue and
> > writes a large amount of data to that queue.
> >
> > Is there a way to delete the data automatically after some time, say
> > whenever the data size reaches 50GB or the retention time exceeds 10
> > seconds, it will be deleted so my disk won't get filled and new data
> can't
> > be written in?
> >
> > Thanks.!
> >
>
>
>
> --
> Thanks,
> Ewen
>


multiple producer throughput

2015-07-27 Thread Yuheng Du
Hi,

I am running 40 producers on a 40-node cluster. The messages are sent to 6
brokers in another cluster. The producers are running the ProducerPerformance
test.

When 20 nodes are running, the throughput is around 13MB/s, and when 40
nodes are running, the throughput is around 9MB/s.

I have set log.retention.ms=9000 to delete the unwanted messages, just to
keep the disk from filling up.

How should I tune the system to get better throughput? Thanks.


deleting data automatically

2015-07-24 Thread Yuheng Du
Hi,

I am testing the kafka producer performance, so I created a queue and
write a large amount of data to that queue.

Is there a way to delete the data automatically after some time? For
example, whenever the data size reaches 50GB or the retention time exceeds
10 seconds, the data would be deleted, so that my disk doesn't fill up and
block new data from being written.

Thanks!


Re: properducertest on multiple nodes

2015-07-24 Thread Yuheng Du
I deleted the queue and recreated it before I ran the test. Things are
working after restarting the broker cluster, thanks!

On Fri, Jul 24, 2015 at 12:06 PM, Gwen Shapira 
wrote:

> Does topic "speedx1" exist?
>
> On Fri, Jul 24, 2015 at 7:09 AM, Yuheng Du 
> wrote:
> > Hi,
> >
> > I am trying to run 20 performance test on 10 nodes using pbsdsh.
> >
> > The messages will send to a 6 brokers cluster. It seems to work for a
> > while. When I delete the test queue and rerun the test, the broker does
> not
> > seem to process incoming messages:
> >
> > [yuhengd@node1739 kafka_2.10-0.8.2.1]$ bin/kafka-run-class.sh
> > org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100
> -1
> > acks=1 bootstrap.servers=130.127.133.72:9092 buffer.memory=67108864
> > batch.size=8196
> >
> > 1 records sent, 0.0 records/sec (0.00 MB/sec), 6.0 ms avg latency,
> > 6.0 max latency.
> >
> > org.apache.kafka.common.errors.TimeoutException: Failed to update
> metadata
> > after 79 ms.
> >
> >
> > Can anyone suggest what the problem is? Or what configuration should I
> > change? Thanks!
>


properducertest on multiple nodes

2015-07-24 Thread Yuheng Du
Hi,

I am trying to run 20 performance tests on 10 nodes using pbsdsh.

The messages are sent to a 6-broker cluster. It seems to work for a
while. When I delete the test queue and rerun the test, the broker does not
seem to process incoming messages:

[yuhengd@node1739 kafka_2.10-0.8.2.1]$ bin/kafka-run-class.sh
org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100 -1
acks=1 bootstrap.servers=130.127.133.72:9092 buffer.memory=67108864
batch.size=8196

1 records sent, 0.0 records/sec (0.00 MB/sec), 6.0 ms avg latency,
6.0 max latency.

org.apache.kafka.common.errors.TimeoutException: Failed to update metadata
after 79 ms.


Can anyone suggest what the problem is? Or what configuration should I
change? Thanks!


Re: broker data directory

2015-07-21 Thread Yuheng Du
Thank you, Nicolas!

On Tue, Jul 21, 2015 at 10:46 AM, Nicolas Phung 
wrote:

> Yes indeed.
>
> # A comma seperated list of directories under which to store log files
> log.dirs=/var/lib/kafka
>
> You can put several disk/partitions too.
>
> Regards,
>
> On Tue, Jul 21, 2015 at 4:37 PM, Yuheng Du 
> wrote:
>
> > Just wanna make sure, in server.properties, the configuration
> > log.dirs=/tmp/kafka-logs
> >
> > specifies the directory of where the log (data) stores, right?
> >
> > If I want the data to be saved elsewhere, this is the configuration I
> need
> > to change, right?
> >
> > Thanks for answering.
> >
> > best,
> >
>


broker data directory

2015-07-21 Thread Yuheng Du
Just wanna make sure, in server.properties, the configuration
log.dirs=/tmp/kafka-logs

specifies the directory where the log (data) is stored, right?

If I want the data to be saved elsewhere, this is the configuration I need
to change, right?

Thanks for answering.

best,


Re: latency performance test

2015-07-16 Thread Yuheng Du
Hi Ewen,

Thank you for your patient explanation. It is very helpful.

Can we assume that the long latency of ProducerPerformance comes from
queuing delay in the buffer and it is related to buffer size?

Thank you!

best,
Yuheng

On Thu, Jul 16, 2015 at 12:21 AM, Ewen Cheslack-Postava 
wrote:

> The tests are meant to evaluate different things and the way they send
> messages is the source of the difference.
>
> EndToEndLatency works with a single message at a time. It produces the
> message then waits for the consumer to receive it. This approach guarantees
> there is no delay due to queuing. The goal with this test is to evaluate
> the *minimum* latency.
>
> ProducerPerformance focuses on achieving maximum throughput. This means it
> will enqueue lots of records so it will always have more data to send (and
> can use batching to increase the throughput). Unlike EndToEndLatency, this
> means records may just sit in a queue on the producer for awhile because
> the maximum number of in flight requests has been reached and it needs to
> wait for responses for those requests. Since EndToEndLatency only ever has
> one record outstanding, it will never encounter this case.
>
> Batching itself doesn't increase the latency because it only occurs when
> the producer is either a) already unable to send messages anyway or b)
> linger.ms is greater than 0, but the tests use the default setting that
> doesn't linger at all.
>
> In your example for ProducerPerformance, you have 100 byte records and will
> buffer up to 64MB. Given the batch size of 8K and default producer settings
> of 5 in flight requests, you can roughly think of one round trip time
> handling 5 * 8K = 40K bytes of data. If your roundtrip is 1ms, then if your
> buffer is full at 64MB it will take you 64 MB / (40 KB/ms) = 1638ms = 1.6s.
> That means that the record that was added at the end of the buffer had to
> just sit in the buffer for 1.6s before it was sent off to the broker. And
> if your buffer is consistently full (which it should be for
> ProducerPerformance since it's sending as fast as it can), that means
> *every* record waits that long.
>
> Of course, these numbers are estimates, depend on my having used 1ms, but
> hopefully should make it clear why you can see relatively large latencies.
>
> -Ewen
>
>
> On Wed, Jul 15, 2015 at 1:38 AM, Yuheng Du 
> wrote:
>
> > Hi,
> >
> > I have run the end to end latency test and the producerPerformance test
> on
> > my kafka cluster according to
> > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> >
> > In end to end latency test, the latency was around 2ms. In
> > producerperformance test, if use batch size 8196 to send 50,000,000
> > records:
> >
> > >bin/kafka-run-class.sh
> org.apache.kafka.clients.tools.ProducerPerformance
> > speedx1 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092
> > buffer.memory=67108864 batch.size=8196
> >
> >
> > The results show that max latency is 3617ms, avg latency 626.7ms. I wanna
> > know why the latency in producerperformance test is significantly larger
> > than end to end test? Is it because of batching? Are the definitons of
> > these two latencies different? I looked at the source code and I believe
> > the latency is measure for the producer.send() function to complete. So
> > does this latency includes transmission delay, transferring delay, and
> what
> > other components?
> >
> >
> > Thanks.
> >
> >
> > best,
> >
> > Yuheng
> >
>
>
>
> --
> Thanks,
> Ewen
>
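
The back-of-the-envelope estimate quoted above (a full 64MB buffer drained at 5 in-flight requests of 8K each per round trip) can be written out as a small calculation. This is only a sketch of that arithmetic, not code from Kafka. The function and parameter names are made up for the example, and the 1 ms round trip is the same assumed figure used in the quoted message.

```python
def buffer_drain_ms(buffer_bytes, batch_bytes, in_flight_requests, round_trip_ms):
    """Estimate how long a record at the back of a full buffer waits."""
    # Each round trip can carry at most in_flight_requests batches.
    bytes_per_round_trip = in_flight_requests * batch_bytes
    drain_rate_bytes_per_ms = bytes_per_round_trip / round_trip_ms
    return buffer_bytes / drain_rate_bytes_per_ms


if __name__ == "__main__":
    wait_ms = buffer_drain_ms(
        buffer_bytes=64 * 1024 * 1024,  # buffer.memory=67108864
        batch_bytes=8 * 1024,           # "8K" batches, as in the estimate
        in_flight_requests=5,           # default cited in the quoted message
        round_trip_ms=1.0,              # assumed round-trip time
    )
    print(f"{wait_ms:.1f} ms")
```

With these inputs the estimate comes to about 1638 ms, matching the 1.6s figure in the quoted message. Halving the buffer size or the round-trip time halves the estimated wait.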


Re: Latency test

2015-07-15 Thread Yuheng Du
Thank you, Tao!
On Jul 15, 2015 6:27 PM, "Tao Feng"  wrote:

> Sorry Yufeng, You  should change it in $KAFKA_HEAP_OPTS.
>
> On Wed, Jul 15, 2015 at 3:09 PM, Tao Feng  wrote:
>
> > Hi Yuheng,
> >
> > You could add the -Xmx1024m in
> > https://github.com/apache/kafka/blob/trunk/bin/kafka-run-class.sh
> > KAFKA_JVM_PERFORMANCE_OPTS.
> >
> >
> >
> > On Wed, Jul 15, 2015 at 12:51 AM, Yuheng Du 
> > wrote:
> >
> >> Tao,
> >>
> >> If I am running on the command line the following command
> >> >bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency
> 192.168.1.3:9092
> >> 192.168.1.1:2181 speedx3 5000 100 1 -Xmx1024m
> >>
> >> It promped that it is not correct. So where should I put the -Xmx1024m
> >> option? Thanks.
> >>
> >>
> >>
> >> On Wed, Jul 15, 2015 at 3:44 AM, Tao Feng  wrote:
> >>
> >> > (Please correct me if I am wrong.) Based on  TestEndToEndLatency(
> >> >
> >> >
> >>
> https://github.com/apache/kafka/blob/trunk/core/src/test/scala/kafka/tools/TestEndToEndLatency.scala
> >> > ),
> >> > consumer_fetch_max_wait corresponds to fetch.wait.max.ms in consumer
> >> > config(
> >> > http://kafka.apache.org/documentation.html#consumerconfigs). The
> >> default
> >> > value listed at document is 100(ms).
> >> >
> >> > To add java heap space to jvm, put -Xmx$Size(max heap size) for your
> jvm
> >> > option.
> >> >
> >> > On Wed, Jul 15, 2015 at 12:29 AM, Yuheng Du  >
> >> > wrote:
> >> >
> >> > > Tao,
> >> > >
> >> > > Thanks. The example on
> >> > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> >> > > is outdated already. The error message shows:
> >> > >
> >> > > USAGE: java kafka.tools.TestEndToEndLatency$ broker_list
> >> > zookeeper_connect
> >> > > topic num_messages consumer_fetch_max_wait producer_acks
> >> > >
> >> > >
> >> > > Can anyone helps me what should be put in consumer_fetch_max_wait?
> >> > Thanks.
> >> > >
> >> > > On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng 
> >> wrote:
> >> > >
> >> > > > I think ProducerPerformance microbenchmark  only measure between
> >> client
> >> > > to
> >> > > > brokers(producer to brokers) and provide latency information.
> >> > > >
> >> > > > On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du <
> >> yuheng.du.h...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > Currently, the latency test from kafka test the end to end
> latency
> >> > > > between
> >> > > > > producers and consumers.
> >> > > > >
> >> > > > > Is there  a way to test the producer to broker  and broker to
> >> > consumer
> >> > > > > delay seperately?
> >> > > > >
> >> > > > > Thanks.
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>


Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
Hi Geoffrey,

Thank you for your detailed explanation. It is really helpful.

I am thinking of going with the second option. Since I have bare-metal access
to all the nodes in the cluster, it's probably better to run real slave
machines instead of virtual machines (correct me if I am wrong).

Each of my nodes has 256 GB of RAM and 2 TB of disk space. How large will a
slave virtual machine be, and how much memory will it take?

Thank you!

best,
Yuheng


On Wed, Jul 15, 2015 at 4:19 PM, Geoffrey Anderson 
wrote:

> Hi Yuheng,
>
> Yes, you should be able to run on either mac or linux.
>
> The test cluster consists of a test-driver machine and some number of slave
> machines. Right now, there are roughly two ways to set up the slave
> machines:
>
> 1) Slave machines are virtual machines *on* the test-driver machine.
> 2) Slave machines are external to the test-driver machine.
>
> 1 is the simplest to set up, but yes it does require installation of the
> virtual machines on the test-driver machine.
>
> The installation of these machines is outlined in the quickstart I
> mentioned (here is a better link for the test README:
> https://github.com/confluentinc/kafka/tree/KAFKA-2276/tests).
>
> The tool we're using to bring up the slave virtual machines is called
> vagrant, so the "vagrant" steps in the quickstart are really telling you
> how to install the virtual machines.
>
> Hope that helps!
>
> Cheers,
> Geoff
>
>
>
>
> On Wed, Jul 15, 2015 at 12:13 PM, Yuheng Du 
> wrote:
>
> > Hi Geoffrey,
> >
> > Thank you for your helpful information. Do I have to install the virtual
> > machines? I am using Mac as the testdriver machine or I can use a linux
> > machine to run testdriver too.
> >
> > Thanks.
> >
> > best,
> > Yuheng
> >
> > On Wed, Jul 15, 2015 at 2:55 PM, Geoffrey Anderson 
> > wrote:
> >
> > > Hi Yuheng,
> > >
> > > Running these tests requires a tool we've created at Confluent called
> > > 'ducktape', which you need to install with the command:
> > > pip install ducktape==0.2.0
> > >
> > > Running the tests locally requires some setup (creation of virtual
> > machines
> > > etc.) which is outlined here:
> > >
> > >
> >
> https://github.com/apache/kafka/pull/70/files#diff-62f0ff60ede3b78b9c95624e2f61d6c1
> > > The instructions in the quickstart show you how to run the tests on
> > cluster
> > > of virtual machines (on a single host)
> > >
> > > Once you have a cluster up and running, you'll be able to run the test
> > > you're interested in:
> > > cd kafka/tests
> > > ducktape kafkatest/tests/benchmark_test.py
> > >
> > > Definitely keep us posted about which parts are difficult, annoying, or
> > > confusing about this process and we'll do our best to help.
> > >
> > > Thanks,
> > > Geoff
> > >
> > >
> > >
> > > On Wed, Jul 15, 2015 at 12:49 AM, Yuheng Du 
> > > wrote:
> > >
> > > > Jiefu,
> > > >
> > > > Have you tried to run benchmark_test.py? I ran it and it asks me for
> > the
> > > > ducktape.services.service
> > > >
> > > > yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python
> > > benchmark_test.py
> > > >
> > > > Traceback (most recent call last):
> > > >
> > > >   File "benchmark_test.py", line 16, in 
> > > >
> > > > from ducktape.services.service import Service
> > > >
> > > > ImportError: No module named ducktape.services.service
> > > >
> > > >
> > > > Can you help me on getting it to work, Ewen? Thanks.
> > > >
> > > >
> > > > best,
> > > >
> > > > Yuheng
> > > >
> > > > On Tue, Jul 14, 2015 at 11:28 PM, Ewen Cheslack-Postava <
> > > e...@confluent.io
> > > > >
> > > > wrote:
> > > >
> > > > > @Jiefu, yes! The patch is functional, I think it's just waiting on
> a
> > > bit
> > > > of
> > > > > final review after the last round of changes. You can definitely
> use
> > it
> > > > for
> > > > > your own benchmarking, and we'd love to see patches for any
> > additional
> > > > > tests we missed in the first pass!
> > > > >
> > > > > -Ewen
> > > > >
> >

Re: kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
Thanks. Here is the source code snippet of the EndtoEndLatency test:


    for (i <- 0 until numMessages) {
      val begin = System.nanoTime
      producer.send(new ProducerRecord[Array[Byte],Array[Byte]](topic, message))
      val received = iter.next
      val elapsed = System.nanoTime - begin
      // poor man's progress bar
      if (i % 1000 == 0)
        println(i + "\t" + elapsed / 1000.0 / 1000.0)
      totalTime += elapsed
      latencies(i) = (elapsed / 1000 / 1000)
    }

iter.next reads a message using a consumer process. Is the producer.send()
function blocking? I think the message size should still matter here, am I
right? In the source code the message is
val message = "hello there beautiful".getBytes
which is 21 bytes.

I see that in this EndtoEndLatency test, no acks are involved, right?

Thanks.
Yuheng

On Wed, Jul 15, 2015 at 3:43 PM, Guozhang Wang  wrote:

> The end-to-end latency record the transferring of a message from producer
> to broker, then to consumer.
>
> I cannot remember the details not but I think the EndtoEndLatency test
> record the latency as average, hence it is small.
>
> Guozhang
>
> On Wed, Jul 15, 2015 at 12:28 PM, Yuheng Du 
> wrote:
>
> > Guozhang,
> >
> > Thank you for explaining. I see that in ProducerPerformance call back
> > functions were used to get the latency metrics.
> > For the TestEndtoEndLatency, does message size matter? What this
> end-to-end
> > latency comprise of, besides transferring a package from source to
> > destination (typically around 0.2 ms for 100bytes message)?
> >
> > Thanks.
> >
> > Yuheng
> >
> > On Wed, Jul 15, 2015 at 3:01 PM, Guozhang Wang 
> wrote:
> >
> > > Yuheng,
> > >
> > > Only TestEndtoEndLatency's number are end to end, for
> ProducerPerformance
> > > the latency is for the send-to-ack latency, which increases as batch
> size
> > > increases.
> > >
> > > Guozhang
> > >
> > > On Wed, Jul 15, 2015 at 11:36 AM, Yuheng Du 
> > > wrote:
> > >
> > > > In kafka performance tests https://gist.github.com/jkreps
> > > > /c7ddb4041ef62a900e6c
> > > >
> > > > The TestEndtoEndLatency results are typically around 2ms, while the
> > > > ProducerPerformance normally has "average latency"around several
> > hundres
> > > ms
> > > > when using batch size 8196.
> > > >
> > > > Are both results talking about end to end latency? and the
> > > > ProducerPerformance results' latency is just larger because of
> batching
> > > of
> > > > several messages?
> > > >
> > > > What is the message size of the TestEndtoEndLatency test uses?
> > > >
> > > > Thanks for any ideas.
> > > >
> > > > best,
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
Guozhang,

Thank you for explaining. I see that ProducerPerformance uses callback
functions to get the latency metrics.
For TestEndtoEndLatency, does message size matter? What does this end-to-end
latency consist of, besides transferring a packet from source to
destination (typically around 0.2 ms for a 100-byte message)?

Thanks.

Yuheng

On Wed, Jul 15, 2015 at 3:01 PM, Guozhang Wang  wrote:

> Yuheng,
>
> Only TestEndtoEndLatency's number are end to end, for ProducerPerformance
> the latency is for the send-to-ack latency, which increases as batch size
> increases.
>
> Guozhang
>
> On Wed, Jul 15, 2015 at 11:36 AM, Yuheng Du 
> wrote:
>
> > In kafka performance tests https://gist.github.com/jkreps
> > /c7ddb4041ef62a900e6c
> >
> > The TestEndtoEndLatency results are typically around 2ms, while the
> > ProducerPerformance normally has "average latency"around several hundres
> ms
> > when using batch size 8196.
> >
> > Are both results talking about end to end latency? and the
> > ProducerPerformance results' latency is just larger because of batching
> of
> > several messages?
> >
> > What is the message size of the TestEndtoEndLatency test uses?
> >
> > Thanks for any ideas.
> >
> > best,
> >
>
>
>
> --
> -- Guozhang
>


Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
Hi Geoffrey,

Thank you for your helpful information. Do I have to install the virtual
machines? I am using a Mac as the test-driver machine, but I can use a Linux
machine to run the test driver too.

Thanks.

best,
Yuheng

On Wed, Jul 15, 2015 at 2:55 PM, Geoffrey Anderson 
wrote:

> Hi Yuheng,
>
> Running these tests requires a tool we've created at Confluent called
> 'ducktape', which you need to install with the command:
> pip install ducktape==0.2.0
>
> Running the tests locally requires some setup (creation of virtual machines
> etc.) which is outlined here:
>
> https://github.com/apache/kafka/pull/70/files#diff-62f0ff60ede3b78b9c95624e2f61d6c1
> The instructions in the quickstart show you how to run the tests on cluster
> of virtual machines (on a single host)
>
> Once you have a cluster up and running, you'll be able to run the test
> you're interested in:
> cd kafka/tests
> ducktape kafkatest/tests/benchmark_test.py
>
> Definitely keep us posted about which parts are difficult, annoying, or
> confusing about this process and we'll do our best to help.
>
> Thanks,
> Geoff
>
>
>
> On Wed, Jul 15, 2015 at 12:49 AM, Yuheng Du 
> wrote:
>
> > Jiefu,
> >
> > Have you tried to run benchmark_test.py? I ran it and it asks me for the
> > ducktape.services.service
> >
> > yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python
> benchmark_test.py
> >
> > Traceback (most recent call last):
> >
> >   File "benchmark_test.py", line 16, in 
> >
> > from ducktape.services.service import Service
> >
> > ImportError: No module named ducktape.services.service
> >
> >
> > Can you help me on getting it to work, Ewen? Thanks.
> >
> >
> > best,
> >
> > Yuheng
> >
> > On Tue, Jul 14, 2015 at 11:28 PM, Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > wrote:
> >
> > > @Jiefu, yes! The patch is functional, I think it's just waiting on a
> bit
> > of
> > > final review after the last round of changes. You can definitely use it
> > for
> > > your own benchmarking, and we'd love to see patches for any additional
> > > tests we missed in the first pass!
> > >
> > > -Ewen
> > >
> > > On Tue, Jul 14, 2015 at 10:53 AM, JIEFU GONG 
> wrote:
> > >
> > > > Yuheng,
> > > > I would recommend looking here:
> > > > http://kafka.apache.org/documentation.html#brokerconfigs and
> scrolling
> > > > down
> > > > to get a better understanding of the default settings and what they
> > mean
> > > --
> > > > it'll tell you what different options for acks does.
> > > >
> > > > Ewen,
> > > > Thank you immensely for your thoughts, they shed a lot of insight
> into
> > > the
> > > > issue. Though it is understandable that your specific results need to
> > be
> > > > verified, it seems that the KIP-25 patch is functional and I can use
> it
> > > for
> > > > my own benchmarking purposes? Is that correct? Thanks again!
> > > >
> > > > On Tue, Jul 14, 2015 at 8:22 AM, Yuheng Du  >
> > > > wrote:
> > > >
> > > > > Also, I guess setting the target throughput to -1 means let it be
> as
> > > high
> > > > > as possible?
> > > > >
> > > > > On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du <
> > yuheng.du.h...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks. If I set the acks=1 in the producer config options in
> > > > > > bin/kafka-run-class.sh
> > > > org.apache.kafka.clients.tools.ProducerPerformance
> > > > > > test7 50000000 100 -1 acks=1 bootstrap.servers=
> > > > > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
> > > > > batch.size=8196?
> > > > > >
> > > > > > Does that mean for each message generated at the producer, the
> > > producer
> > > > > > will wait until the broker sends the ack back, then send another
> > > > message?
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > Yuheng
> > > > > >
> > > > > > On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy <
> > > > ku...@nmsworks.co.in>
> > > > > > wrote:
> > > > > >
> > > > >

kafka TestEndtoEndLatency

2015-07-15 Thread Yuheng Du
In the kafka performance tests
(https://gist.github.com/jkreps/c7ddb4041ef62a900e6c), the
TestEndtoEndLatency results are typically around 2 ms, while
ProducerPerformance normally reports an "average latency" of several hundred
ms when using batch size 8196.

Are both results measuring end-to-end latency? Is the ProducerPerformance
latency just larger because several messages are batched together?

What message size does the TestEndtoEndLatency test use?

Thanks for any ideas.

best,


Re: Latency test

2015-07-15 Thread Yuheng Du
I have run the end-to-end latency test and the ProducerPerformance test on
my Kafka cluster, following
https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

In the end-to-end latency test, the latency was around 2 ms. In the
ProducerPerformance test, I used batch size 8196 to send 50,000,000 records:

>bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
speedx1 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092
buffer.memory=67108864 batch.size=8196


The results show a max latency of 3617 ms and an avg latency of 626.7 ms. Why
is the latency in the ProducerPerformance test so much larger than in the
end-to-end test? Is it because of batching? Are the definitions of these two
latencies different? I looked at the source code and I believe the latency
is measured as the time for the producer.send() function to complete. Does
this latency include transmission delay, transferring delay, and what other
components?


Thanks.


best,

Yuheng

On Wed, Jul 15, 2015 at 3:51 AM, Yuheng Du  wrote:

> Tao,
>
> If I am running on the command line the following command
> >bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092
> 192.168.1.1:2181 speedx3 5000 100 1 -Xmx1024m
>
> It promped that it is not correct. So where should I put the -Xmx1024m
> option? Thanks.
>
>
>
> On Wed, Jul 15, 2015 at 3:44 AM, Tao Feng  wrote:
>
>> (Please correct me if I am wrong.) Based on  TestEndToEndLatency(
>>
>> https://github.com/apache/kafka/blob/trunk/core/src/test/scala/kafka/tools/TestEndToEndLatency.scala
>> ),
>> consumer_fetch_max_wait corresponds to fetch.wait.max.ms in consumer
>> config(
>> http://kafka.apache.org/documentation.html#consumerconfigs). The default
>> value listed at document is 100(ms).
>>
>> To add java heap space to jvm, put -Xmx$Size(max heap size) for your jvm
>> option.
>>
>> On Wed, Jul 15, 2015 at 12:29 AM, Yuheng Du 
>> wrote:
>>
>> > Tao,
>> >
>> > Thanks. The example on
>> https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
>> > is outdated already. The error message shows:
>> >
>> > USAGE: java kafka.tools.TestEndToEndLatency$ broker_list
>> zookeeper_connect
>> > topic num_messages consumer_fetch_max_wait producer_acks
>> >
>> >
>> > Can anyone helps me what should be put in consumer_fetch_max_wait?
>> Thanks.
>> >
>> > On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng  wrote:
>> >
>> > > I think ProducerPerformance microbenchmark  only measure between
>> client
>> > to
>> > > brokers(producer to brokers) and provide latency information.
>> > >
>> > > On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du > >
>> > > wrote:
>> > >
>> > > > Currently, the latency test from kafka test the end to end latency
>> > > between
>> > > > producers and consumers.
>> > > >
>> > > > Is there  a way to test the producer to broker  and broker to
>> consumer
>> > > > delay seperately?
>> > > >
>> > > > Thanks.
>> > > >
>> > >
>> >
>>
>
>


latency performance test

2015-07-15 Thread Yuheng Du
Hi,

I have run the end-to-end latency test and the ProducerPerformance test on
my Kafka cluster, following
https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

In the end-to-end latency test, the latency was around 2 ms. In the
ProducerPerformance test, I used batch size 8196 to send 50,000,000 records:

>bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
speedx1 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092
buffer.memory=67108864 batch.size=8196


The results show a max latency of 3617 ms and an average latency of 626.7 ms.
I would like to know why the latency in the ProducerPerformance test is
significantly larger than in the end-to-end test. Is it because of batching?
Are the definitions of these two latencies different? I looked at the source
code and I believe the latency is measured as the time for the
producer.send() call to complete. So does this latency include transmission
delay, transfer delay, and what other components?
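
A rough sketch of why queueing may dominate here, assuming ProducerPerformance
timestamps each record just before send() and records the latency in the
completion callback, so that time spent waiting in the producer's buffer counts
toward the reported number. With a target throughput of -1 the tool sends as
fast as it can, so the buffer tends to stay full. The figures are taken from
the command above; per-record overhead is ignored.

```shell
# Values copied from the ProducerPerformance command line.
buffer_bytes=67108864      # buffer.memory
record_bytes=100           # record size
batch_bytes=8196           # batch.size

# How much work can sit in the producer before a new record is even sent.
echo "records that fit in the buffer: $((buffer_bytes / record_bytes))"
echo "records per full batch:         $((batch_bytes / record_bytes))"
echo "batches queued when full:       $((buffer_bytes / batch_bytes))"
```

If that assumption holds, a record may wait behind several thousand queued
batches, whereas the end-to-end test sends one message at a time and waits for
it to arrive, so the two numbers are probably not directly comparable.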


Thanks.


best,

Yuheng


Re: kafka benchmark tests

2015-07-15 Thread Yuheng Du
Jiefu,

Have you tried to run benchmark_test.py? I ran it and it fails because the
ducktape.services.service module cannot be found:
yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python benchmark_test.py

Traceback (most recent call last):
  File "benchmark_test.py", line 16, in <module>
    from ducktape.services.service import Service
ImportError: No module named ducktape.services.service


Can you help me get it to work, Ewen? Thanks.


best,

Yuheng

On Tue, Jul 14, 2015 at 11:28 PM, Ewen Cheslack-Postava 
wrote:

> @Jiefu, yes! The patch is functional, I think it's just waiting on a bit of
> final review after the last round of changes. You can definitely use it for
> your own benchmarking, and we'd love to see patches for any additional
> tests we missed in the first pass!
>
> -Ewen
>
> On Tue, Jul 14, 2015 at 10:53 AM, JIEFU GONG  wrote:
>
> > Yuheng,
> > I would recommend looking here:
> > http://kafka.apache.org/documentation.html#brokerconfigs and scrolling
> > down
> > to get a better understanding of the default settings and what they mean
> --
> > it'll tell you what different options for acks does.
> >
> > Ewen,
> > Thank you immensely for your thoughts, they shed a lot of insight into
> the
> > issue. Though it is understandable that your specific results need to be
> > verified, it seems that the KIP-25 patch is functional and I can use it
> for
> > my own benchmarking purposes? Is that correct? Thanks again!
> >
> > On Tue, Jul 14, 2015 at 8:22 AM, Yuheng Du 
> > wrote:
> >
> > > Also, I guess setting the target throughput to -1 means let it be as
> high
> > > as possible?
> > >
> > > On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du 
> > > wrote:
> > >
> > > > Thanks. If I set the acks=1 in the producer config options in
> > > > bin/kafka-run-class.sh
> > org.apache.kafka.clients.tools.ProducerPerformance
> > > > test7 5000 100 -1 acks=1 bootstrap.servers=
> > > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
> > > batch.size=8196?
> > > >
> > > > Does that mean for each message generated at the producer, the
> producer
> > > > will wait until the broker sends the ack back, then send another
> > message?
> > > >
> > > > Thanks.
> > > >
> > > > Yuheng
> > > >
> > > > On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy <
> > ku...@nmsworks.co.in>
> > > > wrote:
> > > >
> > > >> Yes, A list of  Kafka Server host/port pairs to use for establishing
> > the
> > > >> initial connection to the Kafka cluster
> > > >>
> > > >> https://kafka.apache.org/documentation.html#newproducerconfigs
> > > >>
> > > >> On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du <
> yuheng.du.h...@gmail.com>
> > > >> wrote:
> > > >>
> > > >> > Does anyone know what is bootstrap.servers=
> > > >> > esv4-hcl198.grid.linkedin.com:9092 means in the following test
> > > command:
> > > >> >
> > > >> > bin/kafka-run-class.sh
> > > >> org.apache.kafka.clients.tools.ProducerPerformance
> > > >> > test7 5000 100 -1 acks=1 bootstrap.servers=
> > > >> > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
> > > >> batch.size=8196?
> > > >> >
> > > >> > what is bootstrap.servers? Is it the kafka server that I am
> running
> > a
> > > >> test
> > > >> > at?
> > > >> >
> > > >> > Thanks.
> > > >> >
> > > >> > Yuheng
> > > >> >
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava <
> > > >> e...@confluent.io
> > > >> > >
> > > >> > wrote:
> > > >> >
> > > >> > > I implemented (nearly) the same basic set of tests in the system
> > > test
> > > >> > > framework we started at Confluent and that is going to move into
> > > >> Kafka --
> > > >> > > see the wip patch for KIP-25 here:
> > > >> > https://github.com/apache/kafka/pull/70
> > > >> > > In particular, that test is implemented in benchmark_test.py:
> > > >> > >
> > > > >> > >

Re: Latency test

2015-07-15 Thread Yuheng Du
Tao,

If I run the following command on the command line:
>bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092
192.168.1.1:2181 speedx3 5000 100 1 -Xmx1024m

it reports that the command is not correct. So where should I put the
-Xmx1024m option? Thanks.
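
A minimal sketch of one way to pass the heap size, assuming the stock
bin/kafka-run-class.sh in this release reads the KAFKA_HEAP_OPTS environment
variable and falls back to a small default heap when it is unset. Arguments
placed after the class name are handed to the tool itself rather than to the
JVM, which would explain why a trailing -Xmx1024m is rejected. The 1024m value
is only an example.

```shell
# Assumption: kafka-run-class.sh picks up JVM heap flags from KAFKA_HEAP_OPTS.
export KAFKA_HEAP_OPTS="-Xmx1024m"

# Run the tool only if the script is present in the current directory.
if [ -x bin/kafka-run-class.sh ]; then
  bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency \
    192.168.1.3:9092 192.168.1.1:2181 speedx3 5000 100 1
fi
```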



On Wed, Jul 15, 2015 at 3:44 AM, Tao Feng  wrote:

> (Please correct me if I am wrong.) Based on  TestEndToEndLatency(
>
> https://github.com/apache/kafka/blob/trunk/core/src/test/scala/kafka/tools/TestEndToEndLatency.scala
> ),
> consumer_fetch_max_wait corresponds to fetch.wait.max.ms in consumer
> config(
> http://kafka.apache.org/documentation.html#consumerconfigs). The default
> value listed at document is 100(ms).
>
> To add java heap space to jvm, put -Xmx$Size(max heap size) for your jvm
> option.
>
> On Wed, Jul 15, 2015 at 12:29 AM, Yuheng Du 
> wrote:
>
> > Tao,
> >
> > Thanks. The example on
> https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> > is outdated already. The error message shows:
> >
> > USAGE: java kafka.tools.TestEndToEndLatency$ broker_list
> zookeeper_connect
> > topic num_messages consumer_fetch_max_wait producer_acks
> >
> >
> > Can anyone helps me what should be put in consumer_fetch_max_wait?
> Thanks.
> >
> > On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng  wrote:
> >
> > > I think ProducerPerformance microbenchmark  only measure between client
> > to
> > > brokers(producer to brokers) and provide latency information.
> > >
> > > On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du 
> > > wrote:
> > >
> > > > Currently, the latency test from kafka test the end to end latency
> > > between
> > > > producers and consumers.
> > > >
> > > > Is there  a way to test the producer to broker  and broker to
> consumer
> > > > delay seperately?
> > > >
> > > > Thanks.
> > > >
> > >
> >
>


Re: Latency test

2015-07-15 Thread Yuheng Du
I got a Java out-of-heap-space error when running the end-to-end latency test:

yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ bin/kafka-run-class.sh
kafka.tools.TestEndToEndLatency 192.168.1.3:9092 192.168.1.1:2181 speedx3
5000 100 1

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at kafka.tools.TestEndToEndLatency$.main(TestEndToEndLatency.scala:69)
        at kafka.tools.TestEndToEndLatency.main(TestEndToEndLatency.scala)


What command should I use to add Java heap space to the JVM? Thanks!


Yuheng

On Wed, Jul 15, 2015 at 3:29 AM, Yuheng Du  wrote:

> Tao,
>
> Thanks. The example on https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> is outdated already. The error message shows:
>
> USAGE: java kafka.tools.TestEndToEndLatency$ broker_list zookeeper_connect
> topic num_messages consumer_fetch_max_wait producer_acks
>
>
> Can anyone helps me what should be put in consumer_fetch_max_wait? Thanks.
>
> On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng  wrote:
>
>> I think ProducerPerformance microbenchmark  only measure between client to
>> brokers(producer to brokers) and provide latency information.
>>
>> On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du 
>> wrote:
>>
>> > Currently, the latency test from kafka test the end to end latency
>> between
>> > producers and consumers.
>> >
>> > Is there  a way to test the producer to broker  and broker to consumer
>> > delay seperately?
>> >
>> > Thanks.
>> >
>>
>
>


Re: Latency test

2015-07-15 Thread Yuheng Du
Tao,

Thanks. The example on https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
is outdated already. The error message shows:

USAGE: java kafka.tools.TestEndToEndLatency$ broker_list zookeeper_connect
topic num_messages consumer_fetch_max_wait producer_acks


Can anyone tell me what should be put in consumer_fetch_max_wait? Thanks.

On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng  wrote:

> I think ProducerPerformance microbenchmark  only measure between client to
> brokers(producer to brokers) and provide latency information.
>
> On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du 
> wrote:
>
> > Currently, the latency test from kafka test the end to end latency
> between
> > producers and consumers.
> >
> > Is there  a way to test the producer to broker  and broker to consumer
> > delay seperately?
> >
> > Thanks.
> >
>


Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Jiefu,

Now, even though there is enough disk space (less than 18%), when I run

it still gives me an error; the logs say:

[2015-07-14 23:08:48,735] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000
        at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
        at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
        at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
        at kafka.server.KafkaServer.initZk(KafkaServer.scala:157)
        at kafka.server.KafkaServer.startup(KafkaServer.scala:82)
        at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29)
        at kafka.Kafka$.main(Kafka.scala:46)
        at kafka.Kafka.main(Kafka.scala)
[2015-07-14 23:08:48,737] INFO [Kafka Server 1], shutting down (kafka.server.KafkaServer)

I have checked that ZooKeeper is running fine. Can anyone help me understand
why I got this error? Thanks.

On Tue, Jul 14, 2015 at 10:24 PM, Yuheng Du 
wrote:

> But is there a way to let kafka override the old data if the disk is
> filled? Or is it not necessary to use this figure? Thanks.
>
> On Tue, Jul 14, 2015 at 10:14 PM, Yuheng Du 
> wrote:
>
>> Jiefu,
>>
>> I agree with you. I checked the hardware specs of my machines, each one
>> of them has:
>>
>> RAM
>>
>>
>>
>> 256GB ECC Memory (16x 16 GB DDR4 1600MT/s dual rank RDIMMs)
>>
>> Disk
>>
>>
>>
>> Two 1 TB 7.2K RPM 3G SATA HDDs
>>
>> For the throughput versus stored data test, it uses 5*10^10 messages,
>> which has the total volume of 5TB, I made the replication factor to be 3,
>> which means the total size including replicas would be 15TB, which
>> apparently overwhelmed the two brokers I use.
>>
>> Thanks.
>>
>> best,
>> Yuheng
>>
>> On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG  wrote:
>>
>>> Someone may correct me if I am incorrect, but how much disk space do you
>>> have on these nodes? Your exception 'No space left on device' from one of
>>> your brokers seems to suggest that you're full (after all you're writing
>>> 500 million records). If this is the case I believe the expected behavior
>>> for Kafka is to reject any more attempts to write data?
>>>
>>> On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du 
>>> wrote:
>>>
>>> > Also, the log in another broker (not the bootstrap) says:
>>> >
>>> > [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
>>> > writing to highwatermark file:  (kafka.server.ReplicaManager)
>>> >
>>> >
>>> > [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 because
>>> > of error (kafka.network.Processor)
>>> >
>>> > java.io.IOException: Connection reset by peer
>>> >
>>> > at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>>> >
>>> > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>>> >
>>> > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>>> >
>>> > at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>>> >
>>> > at
>>> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>>> >
>>> > at kafka.utils.Utils$.read(Utils.scala:380)
>>> >
>>> > at
>>> >
>>> >
>>> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>>> >
>>> > at kafka.network.Processor.read(SocketServer.scala:444)
>>> >
>>> > at kafka.network.Processor.run(SocketServer.scala:340)
>>> >
>>> > at java.lang.Thread.run(Thread.java:745)
>>> >
>>> > 
>>> >
>>> > java.io.IOException: No space left on device
>>> >
>>> > at java.io.FileOutputStream.writeBytes(Native Method)
>>> >
>>> > at java.io.FileOutputStream.write(FileOutputStream.java:345)
>>> >
>>> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
>>> >
>>> > at sun.nio.cs.StreamEncoder.implFlushBuffe
>>> >
>>> > (END)
>>> >
>>> > On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du 
>>> > wrote:
>>> >
>>

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
But is there a way to let Kafka overwrite the old data if the disk is
full? Or is it not necessary to use this figure? Thanks.
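
A hedged sketch of the broker settings that bound how much data is kept:
log.retention.hours and log.retention.bytes are standard broker properties,
and as far as I know the size limit applies to each partition log rather than
to the disk as a whole. Old segments are deleted in the background, so these
settings cap steady-state usage but may not save a broker that is being filled
faster than segments are cleaned up. The values and the scratch file path below
are illustrative assumptions, not recommendations.

```shell
# Write an example fragment to a scratch file; merge it into
# config/server.properties by hand and restart the broker.
cat > /tmp/retention-example.properties <<'EOF'
# Delete log segments older than one hour (illustrative value).
log.retention.hours=1
# Keep at most about 1 GiB per partition log (illustrative value).
log.retention.bytes=1073741824
EOF

cat /tmp/retention-example.properties
```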

On Tue, Jul 14, 2015 at 10:14 PM, Yuheng Du 
wrote:

> Jiefu,
>
> I agree with you. I checked the hardware specs of my machines, each one of
> them has:
>
> RAM
>
>
>
> 256GB ECC Memory (16x 16 GB DDR4 1600MT/s dual rank RDIMMs)
>
> Disk
>
>
>
> Two 1 TB 7.2K RPM 3G SATA HDDs
>
> For the throughput versus stored data test, it uses 5*10^10 messages,
> which has the total volume of 5TB, I made the replication factor to be 3,
> which means the total size including replicas would be 15TB, which
> apparently overwhelmed the two brokers I use.
>
> Thanks.
>
> best,
> Yuheng
>
> On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG  wrote:
>
>> Someone may correct me if I am incorrect, but how much disk space do you
>> have on these nodes? Your exception 'No space left on device' from one of
>> your brokers seems to suggest that you're full (after all you're writing
>> 500 million records). If this is the case I believe the expected behavior
>> for Kafka is to reject any more attempts to write data?
>>
>> On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du 
>> wrote:
>>
>> > Also, the log in another broker (not the bootstrap) says:
>> >
>> > [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
>> > writing to highwatermark file:  (kafka.server.ReplicaManager)
>> >
>> >
>> > [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 because
>> > of error (kafka.network.Processor)
>> >
>> > java.io.IOException: Connection reset by peer
>> >
>> > at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>> >
>> > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>> >
>> > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>> >
>> > at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>> >
>> > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>> >
>> > at kafka.utils.Utils$.read(Utils.scala:380)
>> >
>> > at
>> >
>> >
>> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>> >
>> > at kafka.network.Processor.read(SocketServer.scala:444)
>> >
>> > at kafka.network.Processor.run(SocketServer.scala:340)
>> >
>> >         at java.lang.Thread.run(Thread.java:745)
>> >
>> > 
>> >
>> > java.io.IOException: No space left on device
>> >
>> > at java.io.FileOutputStream.writeBytes(Native Method)
>> >
>> > at java.io.FileOutputStream.write(FileOutputStream.java:345)
>> >
>> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
>> >
>> > at sun.nio.cs.StreamEncoder.implFlushBuffe
>> >
>> > (END)
>> >
>> > On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du 
>> > wrote:
>> >
>> > > Hi Jiefu, Gwen,
>> > >
>> > > I am running the Throughput versus stored data test:
>> > > bin/kafka-run-class.sh
>> org.apache.kafka.clients.tools.ProducerPerformance
>> > > test 500 100 -1 acks=1 bootstrap.servers=
>> > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
>> > batch.size=8196
>> > >
>> > > After around 50,000,000 messages were sent, I got a bunch of
>> connection
>> > > refused error as I mentioned before. I checked the logs on the broker
>> and
>> > > here is what I see:
>> > >
>> > > [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
>> > > checkpointed highwatermark is found for partition [test,4]
>> > > (kafka.cluster.Partition)
>> > >
>> > > [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in
>> 4
>> > > ms. (kafka.log.Log)
>> > >
>> > > [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in
>> 1
>> > > ms. (kafka.log.Log)
>> > >
>> > > [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in
>> 1
>> > > ms. (kafka.log.Log)
>> > >
>> > > [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in
>> 1
>> > > ms.

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Jiefu,

I agree with you. I checked the hardware specs of my machines; each one of
them has:

RAM: 256 GB ECC memory (16x 16 GB DDR4 1600 MT/s dual-rank RDIMMs)

Disk: two 1 TB 7.2K RPM 3G SATA HDDs

The throughput versus stored data test uses 5*10^10 messages, for a total
volume of 5 TB. I set the replication factor to 3, which means the total
size including replicas would be 15 TB, which apparently overwhelmed the two
brokers I use.
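
The arithmetic behind that estimate, as a small sketch. The 100-byte record
size is taken from the test command; treating both disks on each broker as
fully available to Kafka is an assumption (it depends on how log.dirs is set
up), and decimal terabytes are used throughout.

```shell
# Figures from the message above.
messages=50000000000                                   # 5*10^10 messages
record_bytes=100                                       # record size in the test command
replication=3
brokers=2
disk_bytes_per_broker=$((2 * 1000000000000))           # two 1 TB disks per broker

payload=$((messages * record_bytes))
with_replicas=$((payload * replication))
capacity=$((brokers * disk_bytes_per_broker))

echo "payload:       $((payload / 1000000000000)) TB"
echo "with replicas: $((with_replicas / 1000000000000)) TB"
echo "raw capacity:  $((capacity / 1000000000000)) TB"
```

Even without replication, the 5 TB payload alone exceeds the combined raw
capacity of the two brokers.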

Thanks.

best,
Yuheng

On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG  wrote:

> Someone may correct me if I am incorrect, but how much disk space do you
> have on these nodes? Your exception 'No space left on device' from one of
> your brokers seems to suggest that you're full (after all you're writing
> 500 million records). If this is the case I believe the expected behavior
> for Kafka is to reject any more attempts to write data?
>
> On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du 
> wrote:
>
> > Also, the log in another broker (not the bootstrap) says:
> >
> > [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
> > writing to highwatermark file:  (kafka.server.ReplicaManager)
> >
> >
> > [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 because
> > of error (kafka.network.Processor)
> >
> > java.io.IOException: Connection reset by peer
> >
> > at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> >
> > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> >
> > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> >
> > at sun.nio.ch.IOUtil.read(IOUtil.java:197)
> >
> > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
> >
> > at kafka.utils.Utils$.read(Utils.scala:380)
> >
> > at
> >
> >
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
> >
> > at kafka.network.Processor.read(SocketServer.scala:444)
> >
> > at kafka.network.Processor.run(SocketServer.scala:340)
> >
> > at java.lang.Thread.run(Thread.java:745)
> >
> > 
> >
> > java.io.IOException: No space left on device
> >
> > at java.io.FileOutputStream.writeBytes(Native Method)
> >
> > at java.io.FileOutputStream.write(FileOutputStream.java:345)
> >
> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
> >
> > at sun.nio.cs.StreamEncoder.implFlushBuffe
> >
> > (END)
> >
> > On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du 
> > wrote:
> >
> > > Hi Jiefu, Gwen,
> > >
> > > I am running the Throughput versus stored data test:
> > > bin/kafka-run-class.sh
> org.apache.kafka.clients.tools.ProducerPerformance
> > > test 500 100 -1 acks=1 bootstrap.servers=
> > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
> > batch.size=8196
> > >
> > > After around 50,000,000 messages were sent, I got a bunch of connection
> > > refused error as I mentioned before. I checked the logs on the broker
> and
> > > here is what I see:
> > >
> > > [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
> > > checkpointed highwatermark is found for partition [test,4]
> > > (kafka.cluster.Partition)
> > >
> > > [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4
> > > ms. (kafka.log.Log)
> > >
> > > [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1
> > > ms. (kafka.log.Log)
> > >
> > > [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1
> > > ms. (kafka.log.Log)
> > >
> > > [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1
> > > ms. (kafka.log.Log)
> > >
> > > [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3
> > > ms. (kafka.log.Log)
> > >
> > > [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1
> > > ms. (kafka.log.Log)
> > >
> > > [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1
> > > ms. (kafka.log.Log)
> > >
> > > [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1
> > > ms. (kafka.log.Log)
> > >
> > > [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'te

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Also, the log in another broker (not the bootstrap) says:

[2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error
writing to highwatermark file:  (kafka.server.ReplicaManager)

[2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 because
of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        at kafka.utils.Utils$.read(Utils.scala:380)
        at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
        at kafka.network.Processor.read(SocketServer.scala:444)
        at kafka.network.Processor.run(SocketServer.scala:340)
        at java.lang.Thread.run(Thread.java:745)

java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:345)
        at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
        at sun.nio.cs.StreamEncoder.implFlushBuffe

(END)

On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du  wrote:

> Hi Jiefu, Gwen,
>
> I am running the Throughput versus stored data test:
> bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> test 500 100 -1 acks=1 bootstrap.servers=
> esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196
>
> After around 50,000,000 messages were sent, I got a bunch of connection
> refused error as I mentioned before. I checked the logs on the broker and
> here is what I see:
>
> [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
> checkpointed highwatermark is found for partition [test,4]
> (kafka.cluster.Partition)
>
> [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4
> ms. (kafka.log.Log)
>
> [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3
> ms. (kafka.log.Log)
>
> [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in 1
> ms. (kafka.log.Log)
>
> [2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in 0
> ms. (kafka.log.Log)
>
> [2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to unrecoverable
> I/O error while handling produce request:  (kafka.server.KafkaApis)
>
> kafka.common.KafkaStorageException: I/O exception in append to log 'test-0'
>
> at kafka.log.Log.append(Log.scala:266)
>
> at
> kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)
>
> at
> kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365)
>
> at kafka.utils.Utils$.inLock(Utils.scala:535)
>
> at kafka.utils.Utils$.inReadLock(Utils.scala:541)
>
> at
> kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365)
>
> at
> kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291)
>
> at
> kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282)
>
> at
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>
> at
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>
> at
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>
> at
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
>
> at scala.coll
>
>
>
> Can you help me with this problem? Thanks.
>
> On Tue, Jul 14, 2015 at 5:12 PM, Yuheng Du 
> wrote:
>
>> 

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Hi Jiefu, Gwen,

I am running the Throughput versus stored data test:
bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
test 500 100 -1 acks=1 bootstrap.servers=
esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196

After around 50,000,000 messages were sent, I got a bunch of "connection
refused" errors, as I mentioned before. I checked the logs on the broker and
here is what I see:

[2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No
checkpointed highwatermark is found for partition [test,4]
(kafka.cluster.Partition)

[2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4 ms.
(kafka.log.Log)

[2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3 ms.
(kafka.log.Log)

[2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in 1 ms.
(kafka.log.Log)

[2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in 0 ms.
(kafka.log.Log)

[2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to unrecoverable
I/O error while handling produce request:  (kafka.server.KafkaApis)
kafka.common.KafkaStorageException: I/O exception in append to log 'test-0'
        at kafka.log.Log.append(Log.scala:266)
        at kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379)
        at kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365)
        at kafka.utils.Utils$.inLock(Utils.scala:535)
        at kafka.utils.Utils$.inReadLock(Utils.scala:541)
        at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365)
        at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291)
        at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
        at scala.coll



Can you help me with this problem? Thanks.

On Tue, Jul 14, 2015 at 5:12 PM, Yuheng Du  wrote:

> I checked the logs on the brokers, it seems that the zookeeper or the
> kafka server process is not running on this broker...Thank you guys. I will
> see if it happens again.
>
> On Tue, Jul 14, 2015 at 4:53 PM, JIEFU GONG  wrote:
>
>> Hmm..yeah some error logs would be nice like Gwen pointed out. Do any of
>> your brokers fall out of the ISR when sending messages? It seems like your
>> setup should be fine, so I'm not entirely sure.
>>
>> On Tue, Jul 14, 2015 at 1:31 PM, Yuheng Du 
>> wrote:
>>
>> > Jiefu,
>> >
>> > I am performing these tests on a 6 nodes cluster in cloudlab (a
>> > infrastructure built for scientific research). I use 2 nodes as
>> producers,
>> > 2 as brokers only, and 2 as consumers. I have tested for each individual
>> > machines and they work well. I did not use AWS. Thank you!
>> >
>> > On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG  wrote:
>> >
>> > > Yuheng, are you performing these tests locally or using a service
>> such as
>> > > AWS? I'd try using each separate machine individually first,
>> connecting
>> > to
>> > > the ZK/Kafka servers and ensuring that each is able to first log and
>> > > consume messages independently.
>> > >
>> > > On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira 
>> > > wrote:
>> > >
>> > > > Are there any errors on the broker logs?
>> > > >
>> > > > On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du <
>> yuheng.du.h...@gmail.com>
>> > > > wrote:
>> > > > > Jiefu,
>> > > > >
>> > > >

Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
I checked the logs on the brokers. It seems that either the ZooKeeper or the
Kafka server process is not running on this broker. Thank you guys. I will see
if it happens again.

On Tue, Jul 14, 2015 at 4:53 PM, JIEFU GONG  wrote:

> Hmm..yeah some error logs would be nice like Gwen pointed out. Do any of
> your brokers fall out of the ISR when sending messages? It seems like your
> setup should be fine, so I'm not entirely sure.
>
> On Tue, Jul 14, 2015 at 1:31 PM, Yuheng Du 
> wrote:
>
> > Jiefu,
> >
> > I am performing these tests on a 6 nodes cluster in cloudlab (a
> > infrastructure built for scientific research). I use 2 nodes as
> producers,
> > 2 as brokers only, and 2 as consumers. I have tested for each individual
> > machines and they work well. I did not use AWS. Thank you!
> >
> > On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG  wrote:
> >
> > > Yuheng, are you performing these tests locally or using a service such
> as
> > > AWS? I'd try using each separate machine individually first, connecting
> > to
> > > the ZK/Kafka servers and ensuring that each is able to first log and
> > > consume messages independently.
> > >
> > > On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira 
> > > wrote:
> > >
> > > > Are there any errors on the broker logs?
> > > >
> > > > On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du <
> yuheng.du.h...@gmail.com>
> > > > wrote:
> > > > > Jiefu,
> > > > >
> > > > > Thank you. The three producers can run at the same time. I mean
> > should
> > > > they
> > > > > be started at exactly the same time? (I have three consoles from
> each
> > > of
> > > > > the three machines and I just start the console command manually
> one
> > by
> > > > > one) Or a small variation of the starting time won't matter?
> > > > >
> > > > > Gwen and Jiefu,
> > > > >
> > > > > I have started the three producers at three machines, after a
> while,
> > > all
> > > > of
> > > > > them gives a java.net.ConnectException:
> > > > >
> > > > > [2015-07-14 12:56:46,352] WARN Error in I/O with producer0-link-0/
> > > > > 192.168.1.1 (org.apache.kafka.common.network.Selector)
> > > > >
> > > > > java.net.ConnectException: Connection refused..
> > > > >
> > > > > [2015-07-14 12:56:48,056] WARN Error in I/O with producer1-link-0/
> > > > > 192.168.1.2 (org.apache.kafka.common.network.Selector)
> > > > >
> > > > > java.net.ConnectException: Connection refused.
> > > > >
> > > > > What could be the cause?
> > > > >
> > > > > Thank you guys!
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG 
> > > wrote:
> > > > >
> > > > >> Yuheng,
> > > > >>
> > > > >> Yes, if you read the blog post it specifies that he's using three
> > > > separate
> > > > >> machines. There's no reason the producers cannot be started at the
> > > same
> > > > >> time, I believe.
> > > > >>
> > > > >> On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du <
> > yuheng.du.h...@gmail.com
> > > >
> > > > >> wrote:
> > > > >>
> > > > >> > Hi,
> > > > >> >
> > > > >> > I am running the performance test for kafka.
> > > > > >> > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> > > > >> >
> > > > >> > For the "Three Producers, 3x async replication" scenario, the
> > > command
> > > > is
> > > > >> > the same as one producer:
> > > > >> >
> > > > >> > bin/kafka-run-class.sh
> > > > org.apache.kafka.clients.tools.ProducerPerformance
> > > > >> > test 5000 100 -1 acks=1
> > > > >> > bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> > > > >> > buffer.memory=67108864 batch.size=8196
> > > > >> >
> > > > >> > So How to I run the test for three producers? Do I just run them
> > on
> > > > three
> > > > >> > separate servers at the same time? Will there be some error in
> > this
> > > > way
> > > > >> > since the three producers can't be started at the same time?
> > > > >> >
> > > > >> > Thanks.
> > > > >> >
> > > > >> > best,
> > > > >> > Yuheng
> > > > >> >
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >>
> > > > >> Jiefu Gong
> > > > >> University of California, Berkeley | Class of 2017
> > > > >> B.A Computer Science | College of Letters and Sciences
> > > > >>
> > > > >> jg...@berkeley.edu  | (925) 400-3427
> > > > >>
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Jiefu Gong
> > > University of California, Berkeley | Class of 2017
> > > B.A Computer Science | College of Letters and Sciences
> > >
> > > jg...@berkeley.edu  | (925) 400-3427
> > >
> >
>
>
>
> --
>
> Jiefu Gong
> University of California, Berkeley | Class of 2017
> B.A Computer Science | College of Letters and Sciences
>
> jg...@berkeley.edu  | (925) 400-3427
>


Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Jiefu,

I am performing these tests on a 6-node cluster in CloudLab (an
infrastructure built for scientific research). I use 2 nodes as producers,
2 as brokers only, and 2 as consumers. I have tested each machine
individually and they all work well. I am not using AWS. Thank you!

On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG  wrote:

> Yuheng, are you performing these tests locally or using a service such as
> AWS? I'd try using each separate machine individually first, connecting to
> the ZK/Kafka servers and ensuring that each is able to first log and
> consume messages independently.
>
> On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira 
> wrote:
>
> > Are there any errors on the broker logs?
> >
> > On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du 
> > wrote:
> > > Jiefu,
> > >
> > > > Thank you. The three producers can run at the same time. I mean should
> > > > they be started at exactly the same time? (I have three consoles from
> > > > each of the three machines and I just start the console command manually
> > > > one by one) Or a small variation of the starting time won't matter?
> > >
> > > Gwen and Jiefu,
> > >
> > > > I have started the three producers at three machines, after a while,
> > > > all of them gives a java.net.ConnectException:
> > > >
> > > > [2015-07-14 12:56:46,352] WARN Error in I/O with producer0-link-0/192.168.1.1 (org.apache.kafka.common.network.Selector)
> > >
> > > java.net.ConnectException: Connection refused..
> > >
> > > [2015-07-14 12:56:48,056] WARN Error in I/O with producer1-link-0/
> > > 192.168.1.2 (org.apache.kafka.common.network.Selector)
> > >
> > > java.net.ConnectException: Connection refused.
> > >
> > > What could be the cause?
> > >
> > > Thank you guys!
> > >
> > >
> > >
> > >
> > > On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG 
> wrote:
> > >
> > >> Yuheng,
> > >>
> > > >> Yes, if you read the blog post it specifies that he's using three
> > > >> separate machines. There's no reason the producers cannot be started
> > > >> at the same time, I believe.
> > >>
> > >> On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du  >
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > I am running the performance test for kafka.
> > > >> > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> > >> >
> > > >> > For the "Three Producers, 3x async replication" scenario, the
> > > >> > command is the same as one producer:
> > > >> >
> > > >> > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> > > >> > test 5000 100 -1 acks=1
> > > >> > bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> > > >> > buffer.memory=67108864 batch.size=8196
> > >> >
> > > >> > So how do I run the test for three producers? Do I just run them on
> > > >> > three separate servers at the same time? Will there be some error in
> > > >> > this way since the three producers can't be started at the same time?
> > >> >
> > >> > Thanks.
> > >> >
> > >> > best,
> > >> > Yuheng
> > >> >
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >> Jiefu Gong
> > >> University of California, Berkeley | Class of 2017
> > >> B.A Computer Science | College of Letters and Sciences
> > >>
> > >> jg...@berkeley.edu  | (925) 400-3427
> > >>
> >
>
>
>
> --
>
> Jiefu Gong
> University of California, Berkeley | Class of 2017
> B.A Computer Science | College of Letters and Sciences
>
> jg...@berkeley.edu  | (925) 400-3427
>


Re: How to run the three producers test

2015-07-14 Thread Yuheng Du
Jiefu,

Thank you. The three producers can run at the same time; what I meant is,
should they be started at exactly the same time? (I have a console open on
each of the three machines and I start the command manually, one by one.)
Or does a small variation in the start time not matter?

Gwen and Jiefu,

I have started the three producers on three machines. After a while, all of
them give a java.net.ConnectException:

[2015-07-14 12:56:46,352] WARN Error in I/O with producer0-link-0/192.168.1.1 (org.apache.kafka.common.network.Selector)

java.net.ConnectException: Connection refused..

[2015-07-14 12:56:48,056] WARN Error in I/O with producer1-link-0/192.168.1.2 (org.apache.kafka.common.network.Selector)

java.net.ConnectException: Connection refused.

What could be the cause?

Thank you guys!




On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG  wrote:

> Yuheng,
>
> Yes, if you read the blog post it specifies that he's using three separate
> machines. There's no reason the producers cannot be started at the same
> time, I believe.
>
> On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du 
> wrote:
>
> > Hi,
> >
> > I am running the performance test for kafka.
> > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> >
> > For the "Three Producers, 3x async replication" scenario, the command is
> > the same as one producer:
> >
> > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> > test 5000 100 -1 acks=1
> > bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
> > buffer.memory=67108864 batch.size=8196
> >
> > So how do I run the test for three producers? Do I just run them on three
> > separate servers at the same time? Will there be some error in this way
> > since the three producers can't be started at the same time?
> >
> > Thanks.
> >
> > best,
> > Yuheng
> >
>
>
>
> --
>
> Jiefu Gong
> University of California, Berkeley | Class of 2017
> B.A Computer Science | College of Letters and Sciences
>
> jg...@berkeley.edu  | (925) 400-3427
>
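
One way to narrow down a "Connection refused" like the one reported above is
to check, from each producer machine, whether a plain TCP connection to the
broker's listener port succeeds at all. The following is only a minimal
sketch using the Python standard library; the function name can_connect is
made up for illustration, and the address in the usage comment
(192.168.1.1:9092) is simply the bootstrap address that appears elsewhere in
this thread, not a confirmed broker address for this setup.

```python
import socket


def can_connect(host: str, port: int, timeout_sec: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_sec):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name-resolution failures.
        return False


# Hypothetical usage from a producer machine (address is an assumption):
# print(can_connect("192.168.1.1", 9092))
```

A False result here would point at the network path or at the address the
broker is listening on, rather than at the producer settings. A True result
would only show that the port is reachable, not that publishing will work.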


How to run the three producers test

2015-07-14 Thread Yuheng Du
Hi,

I am running the performance test for Kafka:
https://gist.github.com/jkreps/c7ddb4041ef62a900e6c

For the "Three Producers, 3x async replication" scenario, the command is
the same as one producer:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance \
  test 5000 100 -1 acks=1 \
  bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 \
  buffer.memory=67108864 batch.size=8196

So how do I run the test for three producers? Do I just run them on three
separate servers at the same time? Will that introduce some error, since
the three producers can't be started at exactly the same time?

Thanks.

best,
Yuheng


Latency test

2015-07-14 Thread Yuheng Du
Currently, the latency test that ships with Kafka measures the end-to-end
latency between producers and consumers.

Is there a way to measure the producer-to-broker and broker-to-consumer
delays separately?

Thanks.


Re: kafka benchmark tests

2015-07-14 Thread Yuheng Du
Also, I guess setting the target throughput to -1 means letting it go as high
as possible?

On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du 
wrote:

> Thanks. If I set the acks=1 in the producer config options in
> bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> test7 5000 100 -1 acks=1 bootstrap.servers=
> esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196?
>
> Does that mean for each message generated at the producer, the producer
> will wait until the broker sends the ack back, then send another message?
>
> Thanks.
>
> Yuheng
>
> On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy 
> wrote:
>
>> Yes, A list of  Kafka Server host/port pairs to use for establishing the
>> initial connection to the Kafka cluster
>>
>> https://kafka.apache.org/documentation.html#newproducerconfigs
>>
>> On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du 
>> wrote:
>>
>> > Does anyone know what is bootstrap.servers=
>> > esv4-hcl198.grid.linkedin.com:9092 means in the following test command:
>> >
>> > bin/kafka-run-class.sh
>> org.apache.kafka.clients.tools.ProducerPerformance
>> > test7 5000 100 -1 acks=1 bootstrap.servers=
>> > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
>> batch.size=8196?
>> >
>> > what is bootstrap.servers? Is it the kafka server that I am running a
>> test
>> > at?
>> >
>> > Thanks.
>> >
>> > Yuheng
>> >
>> >
>> >
>> >
>> > On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava <
>> e...@confluent.io
>> > >
>> > wrote:
>> >
>> > > I implemented (nearly) the same basic set of tests in the system test
>> > > framework we started at Confluent and that is going to move into
>> Kafka --
>> > > see the wip patch for KIP-25 here:
>> > https://github.com/apache/kafka/pull/70
>> > > In particular, that test is implemented in benchmark_test.py:
>> > >
>> > >
>> >
>> https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8
>> > >
>> > > Hopefully once that's merged people can reuse that benchmark (and add
>> to
>> > > it!) so they can easily run the same benchmarks across different
>> > hardware.
>> > > Here are some results from an older version of that test on m3.2xlarge
>> > > instances on EC2 using local ephemeral storage (I think... it's been
>> > awhile
>> > > since I ran these numbers and I didn't document methodology that
>> > > carefully):
>> > >
>> > > INFO:_.KafkaBenchmark:=
>> > > INFO:_.KafkaBenchmark:BENCHMARK RESULTS
>> > > INFO:_.KafkaBenchmark:=
>> > > INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208
>> > > rec/sec (65.24 MB/s)
>> > > INFO:_.KafkaBenchmark:Single producer, async 3x replication:
>> > > 667494.359673 rec/sec (63.66 MB/s)
>> > > INFO:_.KafkaBenchmark:Single producer, sync 3x replication:
>> > > 116485.764275 rec/sec (11.11 MB/s)
>> > > INFO:_.KafkaBenchmark:Three producers, async 3x replication:
>> > > 1696519.022182 rec/sec (161.79 MB/s)
>> > > INFO:_.KafkaBenchmark:Message size:
>> > > INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.62 MB/s)
>> > > INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.75 MB/s)
>> > > INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.17 MB/s)
>> > > INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.21 MB/s)
>> > > INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.31 MB/s)
>> > > INFO:_.KafkaBenchmark:Throughput over long run, data > memory:
>> > > INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.30
>> > MB/s)
>> > > INFO:_.KafkaBenchmark:Single consumer: 701031.14 rec/sec
>> (56.830500
>> > > MB/s)
>> > > INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec
>> (267.830800
>> > > MB/s)
>> > > INFO:_.KafkaBenchmark:Producer + consumer:
>> > > INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.60
>> MB/s)
>> > > INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.60
>> MB/s)
>> > > INFO:_.KafkaBenchmark:End-to-end latency: median 2.00 ms, 99%
>> > > 4.00 ms, 99.9% 19.00 ms
>> > >

Re: kafka benchmark tests

2015-07-14 Thread Yuheng Du
Thanks. Suppose I set acks=1 in the producer config options, as in:

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance \
  test7 5000 100 -1 acks=1 \
  bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 \
  buffer.memory=67108864 batch.size=8196

Does that mean that for each message generated at the producer, the producer
waits until the broker sends the ack back before sending another message?

Thanks.

Yuheng

On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy 
wrote:

> Yes, A list of  Kafka Server host/port pairs to use for establishing the
> initial connection to the Kafka cluster
>
> https://kafka.apache.org/documentation.html#newproducerconfigs
>
> On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du 
> wrote:
>
> > Does anyone know what is bootstrap.servers=
> > esv4-hcl198.grid.linkedin.com:9092 means in the following test command:
> >
> > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> > test7 5000 100 -1 acks=1 bootstrap.servers=
> > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864
> batch.size=8196?
> >
> > what is bootstrap.servers? Is it the kafka server that I am running a
> test
> > at?
> >
> > Thanks.
> >
> > Yuheng
> >
> >
> >
> >
> > On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava <
> e...@confluent.io
> > >
> > wrote:
> >
> > > I implemented (nearly) the same basic set of tests in the system test
> > > framework we started at Confluent and that is going to move into Kafka
> --
> > > see the wip patch for KIP-25 here:
> > https://github.com/apache/kafka/pull/70
> > > In particular, that test is implemented in benchmark_test.py:
> > >
> > >
> >
> https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8
> > >
> > > Hopefully once that's merged people can reuse that benchmark (and add
> to
> > > it!) so they can easily run the same benchmarks across different
> > hardware.
> > > Here are some results from an older version of that test on m3.2xlarge
> > > instances on EC2 using local ephemeral storage (I think... it's been
> > awhile
> > > since I ran these numbers and I didn't document methodology that
> > > carefully):
> > >
> > > INFO:_.KafkaBenchmark:=
> > > INFO:_.KafkaBenchmark:BENCHMARK RESULTS
> > > INFO:_.KafkaBenchmark:=
> > > INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208
> > > rec/sec (65.24 MB/s)
> > > INFO:_.KafkaBenchmark:Single producer, async 3x replication:
> > > 667494.359673 rec/sec (63.66 MB/s)
> > > INFO:_.KafkaBenchmark:Single producer, sync 3x replication:
> > > 116485.764275 rec/sec (11.11 MB/s)
> > > INFO:_.KafkaBenchmark:Three producers, async 3x replication:
> > > 1696519.022182 rec/sec (161.79 MB/s)
> > > INFO:_.KafkaBenchmark:Message size:
> > > INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.62 MB/s)
> > > INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.75 MB/s)
> > > INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.17 MB/s)
> > > INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.21 MB/s)
> > > INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.31 MB/s)
> > > INFO:_.KafkaBenchmark:Throughput over long run, data > memory:
> > > INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.30
> > MB/s)
> > > INFO:_.KafkaBenchmark:Single consumer: 701031.14 rec/sec (56.830500
> > > MB/s)
> > > INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec
> (267.830800
> > > MB/s)
> > > INFO:_.KafkaBenchmark:Producer + consumer:
> > > INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.60 MB/s)
> > > INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.60 MB/s)
> > > INFO:_.KafkaBenchmark:End-to-end latency: median 2.00 ms, 99%
> > > 4.00 ms, 99.9% 19.00 ms
> > >
> > > Don't trust these numbers for anything, they were a quick one-off test.
> > I'm
> > > just pasting the output so you get some idea of what the results might
> > look
> > > like. Once we merge the KIP-25 patch, Confluent will be running the
> tests
> > > regularly and results will be available publicly so we'll be able to
> keep
> > > better tabs on performance, albeit for only a specific class of
> hardware.
> > >
> > > For the batch.size question -- I'm not sure the resu

Re: kafka benchmark tests

2015-07-14 Thread Yuheng Du
Does anyone know what bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
means in the following test command?

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance \
  test7 5000 100 -1 acks=1 \
  bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 \
  buffer.memory=67108864 batch.size=8196

What is bootstrap.servers? Is it the Kafka server that I am running the test
against?

Thanks.

Yuheng




On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava 
wrote:

> I implemented (nearly) the same basic set of tests in the system test
> framework we started at Confluent and that is going to move into Kafka --
> see the wip patch for KIP-25 here: https://github.com/apache/kafka/pull/70
> In particular, that test is implemented in benchmark_test.py:
>
> https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8
>
> Hopefully once that's merged people can reuse that benchmark (and add to
> it!) so they can easily run the same benchmarks across different hardware.
> Here are some results from an older version of that test on m3.2xlarge
> instances on EC2 using local ephemeral storage (I think... it's been awhile
> since I ran these numbers and I didn't document methodology that
> carefully):
>
> INFO:_.KafkaBenchmark:=
> INFO:_.KafkaBenchmark:BENCHMARK RESULTS
> INFO:_.KafkaBenchmark:=
> INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208
> rec/sec (65.24 MB/s)
> INFO:_.KafkaBenchmark:Single producer, async 3x replication:
> 667494.359673 rec/sec (63.66 MB/s)
> INFO:_.KafkaBenchmark:Single producer, sync 3x replication:
> 116485.764275 rec/sec (11.11 MB/s)
> INFO:_.KafkaBenchmark:Three producers, async 3x replication:
> 1696519.022182 rec/sec (161.79 MB/s)
> INFO:_.KafkaBenchmark:Message size:
> INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.62 MB/s)
> INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.75 MB/s)
> INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.17 MB/s)
> INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.21 MB/s)
> INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.31 MB/s)
> INFO:_.KafkaBenchmark:Throughput over long run, data > memory:
> INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.30 MB/s)
> INFO:_.KafkaBenchmark:Single consumer: 701031.14 rec/sec (56.830500
> MB/s)
> INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec (267.830800
> MB/s)
> INFO:_.KafkaBenchmark:Producer + consumer:
> INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.60 MB/s)
> INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.60 MB/s)
> INFO:_.KafkaBenchmark:End-to-end latency: median 2.00 ms, 99%
> 4.00 ms, 99.9% 19.00 ms
>
> Don't trust these numbers for anything, they were a quick one-off test. I'm
> just pasting the output so you get some idea of what the results might look
> like. Once we merge the KIP-25 patch, Confluent will be running the tests
> regularly and results will be available publicly so we'll be able to keep
> better tabs on performance, albeit for only a specific class of hardware.
>
> For the batch.size question -- I'm not sure the results in the blog post
> actually have different settings, it could be accidental divergence between
> the script and the blog post. The post specifically notes that tuning the
> batch size in the synchronous case might help, but that he didn't do that.
> If you're trying to benchmark the *optimal* throughput, tuning the batch
> size would make sense. Since synchronous replication will have higher
> latency and there's a limit to how many requests can be in flight at once,
> you'll want a larger batch size to compensate for the additional latency.
> However, in practice the increase you see may be negligible. Somebody who
> has spent more time fiddling with tweaking producer performance may have
> more insight.
>
> -Ewen
>
> On Mon, Jul 13, 2015 at 10:08 AM, JIEFU GONG  wrote:
>
> > Hi all,
> >
> > I was wondering if any of you guys have done benchmarks on Kafka
> > performance before, and if they or their details (# nodes in cluster, #
> > records / size(s) of messages, etc.) could be shared.
> >
> > For comparison purposes, I am trying to benchmark Kafka against some
> > similar services such as Kinesis or Scribe. Additionally, I was wondering
> > if anyone could shed some insight on Jay Kreps' benchmarks that he has
> > openly published here:
> >
> >
> https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> >
> > Specifically, I am unsure of why between his tests of 3x synchronous
> > replication and 3x async replication he changed the batch.size, as well
> as
> > why he is seemingly publishing to incorrect topics:
> >
> > Configs:
> > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> >
> > Any help is greatly appreciated!
> >
> >
> >
> > --
> >
> > Jiefu Gong
> > University of California, Berkeley 
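
The producer MB/s figures in the benchmark output quoted in this message
appear to follow from the record rate and the record size. As a worked
check, assuming 100-byte records (the record size passed on the
ProducerPerformance command line in this thread) and 1 MB = 1024 * 1024
bytes, the "Single producer, no replication" line can be reproduced like
this. The function name is made up for illustration.

```python
def records_per_sec_to_mb_per_sec(records_per_sec: float, record_size_bytes: int) -> float:
    """Convert a record rate to MB/s, assuming 1 MB = 1024 * 1024 bytes."""
    return records_per_sec * record_size_bytes / (1024 * 1024)


# "Single producer, no replication: 684097.470208 rec/sec (65.24 MB/s)"
print(round(records_per_sec_to_mb_per_sec(684097.470208, 100), 2))  # prints 65.24
```

The same arithmetic is a quick sanity check when comparing runs with
different record sizes, since rec/sec alone is not comparable across them.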

Re: performance benchmarking of kafka

2015-07-13 Thread Yuheng Du
Hi,

I appreciate your response. It works now! It was just a typo in the class
name :(

It really has nothing to do with whether you are using the binaries or the
source version of kafka.

Thanks everyone!

On Mon, Jul 13, 2015 at 11:18 PM, tao xiao  wrote:

> org.apache.kafka.clients.tools.ProducerPerformance resides in
> kafka-clients-0.8.2.1.jar.
> You need to make sure the jar exists in $KAFKA_HOME/libs/. I use
> kafka_2.10-0.8.2.1
> too and here is the output
>
> % bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
>
> USAGE: java org.apache.kafka.clients.tools.ProducerPerformance topic_name
> num_records record_size target_records_sec [prop_name=prop_value]*
>
>
>
> On Tue, 14 Jul 2015 at 05:08 Yuheng Du  wrote:
>
> > I am using the binaries of kafka_2.10-0.8.2.1. Could that be the problem?
> > Should I copy the source, kafka-0.8.2.1-src.tgz, to each of my machines,
> > build it there and run the test?
> > Thanks.
> >
> > On Mon, Jul 13, 2015 at 4:37 PM, JIEFU GONG  wrote:
> >
> > > You may need to open up your run-class.sh in a text editor and modify
> the
> > > classpath -- I believe I had a similar error before.
> > >
> > > On Mon, Jul 13, 2015 at 1:16 PM, Yuheng Du 
> > > wrote:
> > >
> > > > Hi guys,
> > > >
> > > > I am trying to replicate the test of benchmarking kafka at
> > > >
> > > >
> > >
> >
> http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> > > > .
> > > >
> > > > When I run
> > > >
> > > > bin/kafka-run-class.sh
> > org.apache.kafka.clients.tools.ProducerPerformance
> > > > test7 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092
> > > > buffer.memory=67108864 batch.size=8196
> > > >
> > > > and I got the following error:
> > > > Error: Could not find or load main class
> > > > org.apache.kafka.client.tools.ProducerPerformance
> > > >
> > > > What should I fix? Thank you!
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Jiefu Gong
> > > University of California, Berkeley | Class of 2017
> > > B.A Computer Science | College of Letters and Sciences
> > >
> > > jg...@berkeley.edu  | (925) 400-3427
> > >
> >
>
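
The USAGE line quoted above lists four positional arguments followed by any
number of prop_name=prop_value pairs. As a rough sketch of how the command
used in this thread maps onto that usage, the helper below assembles the
argument list. The helper name and the dictionary of properties are
illustrative assumptions; only the class name and the argument order are
taken from the usage text.

```python
def producer_perf_command(topic, num_records, record_size, target_records_sec, props):
    """Build the ProducerPerformance argument list in the order given by USAGE."""
    cmd = [
        "bin/kafka-run-class.sh",
        "org.apache.kafka.clients.tools.ProducerPerformance",
        topic,                    # topic_name
        str(num_records),         # num_records
        str(record_size),         # record_size, in bytes
        str(target_records_sec),  # target_records_sec (-1 is used in this thread)
    ]
    # Producer properties follow as prop_name=prop_value pairs.
    cmd += [f"{name}={value}" for name, value in props.items()]
    return cmd


# The values below are copied from the command quoted in this thread.
print(" ".join(producer_perf_command(
    "test7", 5000, 100, -1,
    {"acks": 1, "bootstrap.servers": "192.168.1.1:9092",
     "buffer.memory": 67108864, "batch.size": 8196},
)))
```

Building the command from named values makes it harder to swap two
positional arguments by accident, for example the record count and the
record size.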


Re: performance benchmarking of kafka

2015-07-13 Thread Yuheng Du
I am using the binaries of kafka_2.10-0.8.2.1. Could that be the problem?
Should I copy the source, kafka-0.8.2.1-src.tgz, to each of my machines,
build it there and run the test?
Thanks.

On Mon, Jul 13, 2015 at 4:37 PM, JIEFU GONG  wrote:

> You may need to open up your run-class.sh in a text editor and modify the
> classpath -- I believe I had a similar error before.
>
> On Mon, Jul 13, 2015 at 1:16 PM, Yuheng Du 
> wrote:
>
> > Hi guys,
> >
> > I am trying to replicate the test of benchmarking kafka at
> >
> >
> http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> > .
> >
> > When I run
> >
> > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> > test7 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092
> > buffer.memory=67108864 batch.size=8196
> >
> > and I got the following error:
> > Error: Could not find or load main class
> > org.apache.kafka.client.tools.ProducerPerformance
> >
> > What should I fix? Thank you!
> >
>
>
>
> --
>
> Jiefu Gong
> University of California, Berkeley | Class of 2017
> B.A Computer Science | College of Letters and Sciences
>
> jg...@berkeley.edu  | (925) 400-3427
>


Re: performance benchmarking of kafka

2015-07-13 Thread Yuheng Du
Thank you. I see that in run-class.sh, they have the following lines:

for file in $base_dir/clients/build/libs/kafka-clients*.jar;
do
  CLASSPATH=$CLASSPATH:$file
done

So I believe all the jars in the libs/ directory have already been included
in the classpath?

Which directory does the ProducerPerformance class reside in?

Thanks.

On Mon, Jul 13, 2015 at 4:37 PM, JIEFU GONG  wrote:

> You may need to open up your run-class.sh in a text editor and modify the
> classpath -- I believe I had a similar error before.
>
> On Mon, Jul 13, 2015 at 1:16 PM, Yuheng Du 
> wrote:
>
> > Hi guys,
> >
> > I am trying to replicate the test of benchmarking kafka at
> >
> >
> http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> > .
> >
> > When I run
> >
> > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance
> > test7 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092
> > buffer.memory=67108864 batch.size=8196
> >
> > and I got the following error:
> > Error: Could not find or load main class
> > org.apache.kafka.client.tools.ProducerPerformance
> >
> > What should I fix? Thank you!
> >
>
>
>
> --
>
> Jiefu Gong
> University of California, Berkeley | Class of 2017
> B.A Computer Science | College of Letters and Sciences
>
> jg...@berkeley.edu  | (925) 400-3427
>
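
One way to answer the "which jar holds this class" question is to look
inside the jars directly, since a jar is a zip archive and a class a.b.C is
stored as the entry a/b/C.class. The following is a minimal sketch under
that assumption; the function name and the "libs" directory in the usage
comment are placeholders, not part of any Kafka tooling.

```python
import zipfile
from pathlib import Path


def jars_containing_class(libs_dir, class_name):
    """Return the names of the .jar files under libs_dir that contain class_name."""
    # A fully qualified class name maps to a path-like zip entry ending in .class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for jar_path in sorted(Path(libs_dir).glob("*.jar")):
        with zipfile.ZipFile(jar_path) as jar:
            if entry in jar.namelist():
                matches.append(jar_path.name)
    return matches


# Hypothetical usage (the directory is a placeholder for your Kafka install):
# jars_containing_class("libs", "org.apache.kafka.clients.tools.ProducerPerformance")
```

An empty result for a class name would suggest either that the jar is not
in that directory or that the class name itself is misspelled.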


performance benchmarking of kafka

2015-07-13 Thread Yuheng Du
Hi guys,

I am trying to replicate the Kafka benchmark described at
http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

When I run

bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance \
  test7 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 \
  buffer.memory=67108864 batch.size=8196

I get the following error:

Error: Could not find or load main class org.apache.kafka.client.tools.ProducerPerformance

What should I fix? Thank you!


Re: A kafka web monitor

2015-03-27 Thread Yuheng Du
Hi Wan,

I tried to install DCMonitor, but when I try to clone the project it gives
me "Permission denied, the remote end hung up unexpectedly". Do you have
any suggestions for this issue?

Thanks.

best,
Yuheng

On Mon, Mar 23, 2015 at 8:54 AM, Wan Wei  wrote:

> We have made a simple web console for monitoring Kafka information such as
> consumer offsets and log size.
>
> https://github.com/shunfei/DCMonitor
>
>
> Hope you like it and offer your help to make it better :)
> Regards
> Flow
>


Re: kafka topic information

2015-03-09 Thread Yuheng Du
Thanks, got it!

best,
Yuheng

On Mon, Mar 9, 2015 at 11:52 AM, Harsha  wrote:

> In general users are expected to run a zookeeper cluster of 3 or 5 nodes.
> Zookeeper requires a quorum of servers running, which means a majority
> (ceil(n/2) for these odd-sized clusters) needs to be up. For 3 zookeeper
> nodes at least 2 zk nodes need to be up at any time, i.e. your cluster can
> function fine in case of 1 machine failure, and in case of 5 there should
> be at least 3 nodes up and running. For more info on zookeeper you can
> look here:
> http://zookeeper.apache.org/doc/r3.4.6/
> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html
>
>
> --
> Harsha
>
> On March 9, 2015 at 8:39:00 AM, Yuheng Du (yuheng.du.h...@gmail.com)
> wrote:
>
> Harsha,
>
> Thanks for reply. So what if the zookeeper cluster fails? Will the topics
> information be lost? What fault-tolerant mechanism does zookeeper offer?
>
> best,
>
> On Mon, Mar 9, 2015 at 11:36 AM, Harsha  wrote:
>
>>  Yuheng,
>>kafka keeps cluster metadata in zookeeper along with topic
>> metadata as well. You can use zookeeper-shell.sh or zkCli.sh to check zk
>> nodes, /brokers/topics will give you the list of topics .
>>
>>  --
>> Harsha
>>
>>
>> On March 9, 2015 at 8:20:59 AM, Yuheng Du (yuheng.du.h...@gmail.com)
>> wrote:
>>
>>  I am wondering where does kafka cluster keep the topic metadata (name,
>> partition, replication, etc)? How does a server recover the topic's
>> metadata and messages after restart and what data will be lost?
>>
>> Thanks for anyone to answer my questions.
>>
>> best,
>> Yuheng
>>
>>
>
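
The quorum rule of thumb in Harsha's reply can be written as a small
calculation. ZooKeeper needs a strict majority of the ensemble to be up,
i.e. floor(n/2) + 1 servers, which is the same as ceil(n/2) for the odd
ensemble sizes (3 and 5) discussed in this thread. The helper names below
are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 5):
    # Prints the ensemble size, the quorum size and the tolerated failures.
    print(n, quorum_size(n), tolerated_failures(n))
```

For 3 nodes this gives a quorum of 2 with 1 tolerated failure, and for 5
nodes a quorum of 3 with 2 tolerated failures, matching the numbers in the
reply.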


Re: kafka topic information

2015-03-09 Thread Yuheng Du
Harsha,

Thanks for the reply. So what if the zookeeper cluster fails? Will the topic
information be lost? What fault-tolerance mechanism does zookeeper offer?

best,

On Mon, Mar 9, 2015 at 11:36 AM, Harsha  wrote:

> Yuheng,
>   kafka keeps cluster metadata in zookeeper along with topic
> metadata as well. You can use zookeeper-shell.sh or zkCli.sh to check zk
> nodes, /brokers/topics will give you the list of topics .
>
> --
> Harsha
>
>
> On March 9, 2015 at 8:20:59 AM, Yuheng Du (yuheng.du.h...@gmail.com)
> wrote:
>
> I am wondering where does kafka cluster keep the topic metadata (name,
> partition, replication, etc)? How does a server recover the topic's
> metadata and messages after restart and what data will be lost?
>
> Thanks for anyone to answer my questions.
>
> best,
> Yuheng
>
>


kafka topic information

2015-03-09 Thread Yuheng Du
I am wondering where a kafka cluster keeps the topic metadata (name,
partitions, replication, etc.). How does a server recover the topic's
metadata and messages after a restart, and what data will be lost?

Thanks to anyone who can answer my questions.

best,
Yuheng


Re: Set up kafka cluster

2015-03-05 Thread Yuheng Du
will do, Thanks!

On Thu, Mar 5, 2015 at 3:35 PM, Gwen Shapira  wrote:

> Did you take a look at the quick-start guide?
> https://kafka.apache.org/082/quickstart.html
>
> It shows how to set up a single node, how to validate that it's working
> and then how to set up multi-node cluster.
>
> Good luck!
>
> On Thu, Mar 5, 2015 at 12:30 PM, Yuheng Du 
> wrote:
> > Thank you Gwen,
> >
> > I also need the kafka cluster continue to provide message brokering
> service
> > to a Storm cluster after the benchmarking. I am fairly new to cluster
> > setups. So is there an instruction telling me how to set up the
> three-node
> > kafka cluster before running benchmarking? That would be really helpful.
> >
> > Thanks gain.
> >
> > best,
> > Yuheng
> >
> > On Thu, Mar 5, 2015 at 3:22 PM, Gwen Shapira 
> wrote:
> >
> >> Jay Kreps has a gist with step by step instructions for reproducing
> >> the benchmarks used by LinkedIn:
> >> https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
> >>
> >> And the blog with the results:
> >>
> >>
> https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> >>
> >> Gwen
> >>
> >> On Thu, Mar 5, 2015 at 12:16 PM, Yuheng Du 
> >> wrote:
> >> > Hi everyone,
> >> >
> >> > I am trying to set up a kafka cluster consisting of three machines. I
> >> wanna
> >> > run a benchmarking program in them. Can anyone recommend a step by
> step
> >> > tutorial/instruction of how I can do it?
> >> >
> >> > Thanks.
> >> >
> >> > best,
> >> > Yuheng
> >>
>


Re: Set up kafka cluster

2015-03-05 Thread Yuheng Du
Thank you Gwen,

I also need the kafka cluster to continue providing message brokering
service to a Storm cluster after the benchmarking. I am fairly new to
cluster setups. Are there instructions for setting up the three-node kafka
cluster before running the benchmark? That would be really helpful.

Thanks again.

best,
Yuheng

On Thu, Mar 5, 2015 at 3:22 PM, Gwen Shapira  wrote:

> Jay Kreps has a gist with step by step instructions for reproducing
> the benchmarks used by LinkedIn:
> https://gist.github.com/jkreps/c7ddb4041ef62a900e6c
>
> And the blog with the results:
>
> https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
>
> Gwen
>
> On Thu, Mar 5, 2015 at 12:16 PM, Yuheng Du 
> wrote:
> > Hi everyone,
> >
> > I am trying to set up a kafka cluster consisting of three machines. I
> wanna
> > run a benchmarking program in them. Can anyone recommend a step by step
> > tutorial/instruction of how I can do it?
> >
> > Thanks.
> >
> > best,
> > Yuheng
>


Set up kafka cluster

2015-03-05 Thread Yuheng Du
Hi everyone,

I am trying to set up a kafka cluster consisting of three machines. I want
to run a benchmarking program on them. Can anyone recommend a step-by-step
tutorial or instructions for how to do it?

Thanks.

best,
Yuheng