Re: connection time out
Also, I can see the topic "speedx2" being created in the broker, but no message data is coming through. On Sun, Nov 29, 2015 at 7:00 PM, Yuheng Du wrote: > Hi guys, > > I was running a single node broker in a cluster. And when I run the > producer in another cluster, I got a connection timeout error. > > I can reach port 9092 and other ports on the broker machine from the > producer. I just can't publish any messages. The command I used to run the > producer is: > > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > speedx2 50 100 -1 acks=1 bootstrap.servers=130.127.xxx.xxx:9092 > buffer.memory=67184 batch.size=8196 > > Can anyone suggest what the problem might be? > > > Thank you! > > > best, > > Yuheng >
connection time out
Hi guys, I was running a single node broker in a cluster, and when I run the producer in another cluster I get a connection timeout error. I can reach port 9092 and other ports on the broker machine from the producer; I just can't publish any messages. The command I used to run the producer is: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance speedx2 50 100 -1 acks=1 bootstrap.servers=130.127.xxx.xxx:9092 buffer.memory=67184 batch.size=8196 Can anyone suggest what the problem might be? Thank you! best, Yuheng
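[Editor's note, hedged: a symptom like this — the port is reachable and the topic gets created, but publishes time out — is often caused by the broker registering an internal hostname that the remote producer cannot resolve. This is a guess, not a confirmed diagnosis of this thread. On 0.8.x-era brokers the advertised address is set in server.properties:]

```properties
# server.properties on the broker (sketch; use the address that is
# publicly reachable from the producer's network, not the internal
# hostname the broker would otherwise register)
advertised.host.name=130.127.xxx.xxx
advertised.port=9092
```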
Re: producer api
Thank you Erik. But in my setup only one node in my broker cluster has a public IP, so I can only use one bootstrap broker for now. On Mon, Sep 14, 2015 at 3:50 PM, Helleren, Erik wrote: > You only need one of the brokers to connect for publishing. Kafka will > tell the client about all the other brokers. But best practices state > including all of them is best. > -Erik > > On 9/14/15, 2:46 PM, "Yuheng Du" wrote: > > >I am writing a kafka producer application in java. I want the producer to > >publish data to a cluster of 6 brokers. Is there a way to specify only the > >load balancing node but not the full broker list? > > > >For example, like in the benchmarking kafka commands: > > > >bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > >test 5000 100 -1 acks=-1 bootstrap.servers= > >esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 > batch.size=64000 > > > >it only specifies the bootstrap server node but not the full broker list > >like: > > > >Properties props = new Properties(); > > > >props.put("metadata.broker.list", "broker1:9092,broker2:9092"); > > > > > > > >Thanks for replying. > >
producer api
I am writing a kafka producer application in java. I want the producer to publish data to a cluster of 6 brokers. Is there a way to specify only the load balancing node but not the full broker list? For example, like in the benchmarking kafka commands: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test 5000 100 -1 acks=-1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=64000 it only specifies the bootstrap server node, not the full broker list like: Properties props = new Properties(); props.put("metadata.broker.list", "broker1:9092,broker2:9092"); Thanks for replying.
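[Editor's note: Erik's answer above can be sketched in code. With the new producer API, bootstrap.servers is only a discovery list — one reachable broker is enough, because the client learns the full cluster from that broker's metadata; listing more brokers just adds fallbacks if the first is down. The values below are illustrative, taken from the commands in this thread.]

```java
import java.util.Properties;

public class ProducerBootstrap {
    // Sketch of a new-producer configuration: a single bootstrap broker
    // suffices; the client discovers the other 5 brokers on its own.
    public static Properties producerProps(String bootstrap) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrap); // one broker is enough
        props.put("acks", "1");
        props.put("buffer.memory", "67108864");
        props.put("batch.size", "64000");
        return props;
    }

    public static void main(String[] args) {
        Properties p = producerProps("esv4-hcl198.grid.linkedin.com:9092");
        System.out.println(p.getProperty("bootstrap.servers"));
    }
}
```

These same properties would be passed to `new KafkaProducer<>(props)` along with key/value serializers.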
Re: latency test
Thank you Erik. In my test I am using fixed 200-byte messages and I run 500k messages per producer on 92 physically isolated producers. Each test run takes about 20 minutes. As the broker cluster is migrated into a new physical cluster, I will perform my test and get the latency results in the next couple of weeks. I will keep you posted. Thanks. On Wed, Sep 9, 2015 at 4:58 PM, Helleren, Erik wrote: > Yes, and that can really hurt average performance. All the partitions > were nearly identical up to the 99%’ile, and had very good performance at > that level hovering around a few milli’s. But when looking beyond the > 99%’ile, there was that clear fork in the distribution where a set of 3 > partitions surged upwards. This could be for a dozen different reasons: > Network blips, noisy networks, location in the network, resource > contention on that broker, etc. But it affected that one broker more than > others. And the reasons for my cluster displaying this behavior could be > very different than the reason for any other cluster. > > It's worth noting that this was mostly a latency test rather than a stress test. > There was a single kafka producer object, very small message sizes (100 > bytes), and it was only pushing through around 5MB/s worth of data. And > the client was configured to minimize the amount of data that would be on > the internal queue/buffer waiting to be sent. The messages that were > being sent were comprised of 10 byte ascii ‘words’ selected randomly > from a dictionary of 1000 words, which benefits compression while still > resulting in likely unique messages. And the test I ran was only for 6 > min, and I did not do the work required to see if there was a burst of > slower messages which caused this behavior, or if it was a consistent > issue with that node. > -Erik > > > On 9/9/15, 2:24 PM, "Yuheng Du" wrote: > > >So are you suggesting that the long delays in the top 1 percentile > >happen in the slower partitions that are further away? Thanks. 
> > > >On Wed, Sep 9, 2015 at 3:15 PM, Helleren, Erik > > > >wrote: > > > >> So, I did my own latency test on a cluster of 3 nodes, and there is a > >> significant difference around the 99%’ile and higher for partitions when > >> measuring the ack time when configured for a single ack. The graph > >> that I wish I could attach or post clearly shows that around 1/3 of the > >> partitions significantly diverge from the other two. So, at least in my > >> case, one of my brokers is further than the others. > >> -Erik > >> > >> On 9/4/15, 1:06 PM, "Yuheng Du" wrote: > >> > >> >No problem. Thanks for your advice. I think it would be fun to > >>explore. I > >> >only know how to program in java though. Hope it will work. > >> > > >> >On Fri, Sep 4, 2015 at 2:03 PM, Helleren, Erik > >> > > >> >wrote: > >> > > >> >> I think the suggestion is to have partitions/brokers >=1, so 32 > >>should > >> >>be > >> >> enough. > >> >> > >> >> As for latency tests, there isn’t a lot of code to do a latency test. > >> >>If > >> >> you just want to measure ack time it's around 100 lines. I will try > >>to > >> >> push out some good latency testing code to github, but my company is > >> >> scared of open sourcing code… so it might be a while… > >> >> -Erik > >> >> > >> >> > >> >> On 9/4/15, 12:55 PM, "Yuheng Du" wrote: > >> >> > >> >> >Thanks for your reply Erik. I am running some more tests according > >>to > >> >>your > >> >> >suggestions now and I will share with my results here. Is it > >>necessary > >> >>to > >> >> >use a fixed number of partitions (32 partitions maybe) for my test? > >> >> > > >> >> >I am testing 2, 4, 8, 16 and 32 brokers scenarios, all of them are > >> >>running > >> >> >on individual physical nodes. So I think using at least 32 > >>partitions > >> >>will > >> >> >make more sense? I have seen latencies increase as the number of > >> >> >partitions > >> >> >goes up in my experiments. 
> >> >> > > >> >> >To get the latency of each event data recorded, are you suggesting > >> >>that I > >> >> >rewrite my own test program (in Java perhaps) or I can just modify
Re: latency test
So are you suggesting that the long delays in the top 1 percentile happen in the slower partitions that are further away? Thanks. On Wed, Sep 9, 2015 at 3:15 PM, Helleren, Erik wrote: > So, I did my own latency test on a cluster of 3 nodes, and there is a > significant difference around the 99%’ile and higher for partitions when > measuring the ack time when configured for a single ack. The graph > that I wish I could attach or post clearly shows that around 1/3 of the > partitions significantly diverge from the other two. So, at least in my > case, one of my brokers is further than the others. > -Erik > > On 9/4/15, 1:06 PM, "Yuheng Du" wrote: > > >No problem. Thanks for your advice. I think it would be fun to explore. I > >only know how to program in java though. Hope it will work. > > > >On Fri, Sep 4, 2015 at 2:03 PM, Helleren, Erik > > > >wrote: > > > >> I think the suggestion is to have partitions/brokers >=1, so 32 should > >>be > >> enough. > >> > >> As for latency tests, there isn’t a lot of code to do a latency test. > >>If > >> you just want to measure ack time it's around 100 lines. I will try to > >> push out some good latency testing code to github, but my company is > >> scared of open sourcing code… so it might be a while… > >> -Erik > >> > >> > >> On 9/4/15, 12:55 PM, "Yuheng Du" wrote: > >> > >> >Thanks for your reply Erik. I am running some more tests according to > >>your > >> >suggestions now and I will share with my results here. Is it necessary > >>to > >> >use a fixed number of partitions (32 partitions maybe) for my test? > >> > > >> >I am testing 2, 4, 8, 16 and 32 brokers scenarios, all of them are > >>running > >> >on individual physical nodes. So I think using at least 32 partitions > >>will > >> >make more sense? I have seen latencies increase as the number of > >> >partitions > >> >goes up in my experiments. 
> >> > > >> >To get the latency of each event data recorded, are you suggesting > >>that I > >> >rewrite my own test program (in Java perhaps) or I can just modify the > >> >standard test program provided by kafka ( > >> >https://gist.github.com/jkreps/c7ddb4041ef62a900e6c )? I guess I need > >>to > >> >rebuild the source if I modify the standard java test program > >> >ProducerPerformance provided in kafka, right? Now this standard program > >> >only has average latencies and percentile latencies but no per event > >> >latencies. > >> > > >> >Thanks. > >> > > >> >On Fri, Sep 4, 2015 at 1:42 PM, Helleren, Erik > >> > > >> >wrote: > >> > > >> >> That is an excellent question! There are a bunch of ways to monitor > >> >> jitter and see when that is happening. Here are a few: > >> >> > >> >> - You could slice the histogram every few seconds, save it out with a > >> >> timestamp, and then look at how they compare. This would be mostly > >> >> manual, or you can graph line charts of the percentiles over time in > >> >>excel > >> >> where each percentile would be a series. If you are using HDR > >> >>Histogram, > >> >> you should look at how to use the Recorder class to do this coupled > >> >>with a > >> >> ScheduledExecutorService. > >> >> > >> >> - You can just save the starting timestamp of the event and the > >>latency > >> >>of > >> >> each event. If you put it into a CSV, you can just load it up into > >> >>excel > >> >> and graph as a XY chart. That way you can see every point during the > >> >> running of your program and you can see trends. You want to be > >>careful > >> >> about this one, especially of writing to a file in the callback that > >> >>kafka > >> >> provides. > >> >> > >> >> Also, I have noticed that most of the very slow observations are at > >> >> startup. But don’t trust me, trust the data and share your findings. > >> >> Also, having a 99.9 percentile provides a pretty good standard for > >> >>typical > >> >> poor case performance. 
Average is borderline useless, 50%’ile is a > >> >>better > >> >> typica
Re: latency test
No problem. Thanks for your advice. I think it would be fun to explore. I only know how to program in java though. Hope it will work. On Fri, Sep 4, 2015 at 2:03 PM, Helleren, Erik wrote: > I think the suggestion is to have partitions/brokers >=1, so 32 should be > enough. > > As for latency tests, there isn’t a lot of code to do a latency test. If > you just want to measure ack time it's around 100 lines. I will try to > push out some good latency testing code to github, but my company is > scared of open sourcing code… so it might be a while… > -Erik > > > On 9/4/15, 12:55 PM, "Yuheng Du" wrote: > > >Thanks for your reply Erik. I am running some more tests according to your > >suggestions now and I will share with my results here. Is it necessary to > >use a fixed number of partitions (32 partitions maybe) for my test? > > > >I am testing 2, 4, 8, 16 and 32 brokers scenarios, all of them are running > >on individual physical nodes. So I think using at least 32 partitions will > >make more sense? I have seen latencies increase as the number of > >partitions > >goes up in my experiments. > > > >To get the latency of each event data recorded, are you suggesting that I > >rewrite my own test program (in Java perhaps) or I can just modify the > >standard test program provided by kafka ( > >https://gist.github.com/jkreps/c7ddb4041ef62a900e6c )? I guess I need to > >rebuild the source if I modify the standard java test program > >ProducerPerformance provided in kafka, right? Now this standard program > >only has average latencies and percentile latencies but no per event > >latencies. > > > >Thanks. > > > >On Fri, Sep 4, 2015 at 1:42 PM, Helleren, Erik > > > >wrote: > > > >> That is an excellent question! There are a bunch of ways to monitor > >> jitter and see when that is happening. Here are a few: > >> > >> - You could slice the histogram every few seconds, save it out with a > >> timestamp, and then look at how they compare. 
This would be mostly > >> manual, or you can graph line charts of the percentiles over time in > >>excel > >> where each percentile would be a series. If you are using HDR > >>Histogram, > >> you should look at how to use the Recorder class to do this coupled > >>with a > >> ScheduledExecutorService. > >> > >> - You can just save the starting timestamp of the event and the latency > >>of > >> each event. If you put it into a CSV, you can just load it up into > >>excel > >> and graph as a XY chart. That way you can see every point during the > >> running of your program and you can see trends. You want to be careful > >> about this one, especially of writing to a file in the callback that > >>kafka > >> provides. > >> > >> Also, I have noticed that most of the very slow observations are at > >> startup. But don’t trust me, trust the data and share your findings. > >> Also, having a 99.9 percentile provides a pretty good standard for > >>typical > >> poor case performance. Average is borderline useless, 50%’ile is a > >>better > >> typical case because that’s the number that says “half of events will be > >> this slow or faster”, or for values that are high like 99.9%’ile, “0.1% > >>of > >> all events will be slower than this”. > >> -Erik > >> > >> On 9/4/15, 12:05 PM, "Yuheng Du" wrote: > >> > >> >Thank you Erik! That's helpful! > >> > > >> >But also I see jitters of the maximum latencies when running the > >> >experiment. > >> > > >> >The average end to acknowledgement latency from producer to broker is > >> >around 5ms when using 92 producers and 4 brokers, and the 99.9 > >>percentile > >> >latency is 58ms, but the maximum latency goes up to 1359 ms. How can I > >>locate > >> >the source of this jitter? > >> > > >> >Thanks. 
> >> > > >> >On Fri, Sep 4, 2015 at 10:54 AM, Helleren, Erik > >> > > >> >wrote: > >> > > >> >> Well… not to be contrarian, but latency depends much more on the > >>latency > >> >> between the producer and the broker that is the leader for the > >>partition > >> >> you are publishing to. At least when your brokers are not saturated > >> >>with > >> >> messages, and acks are set to 1. If acks are set to ALL, latency
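[Editor's note: the point Erik keeps making — that the average hides the tail — can be illustrated with a minimal nearest-rank percentile computation over per-event ack latencies. Plain Java, no HdrHistogram; the synthetic numbers are made up to mimic a fast median with a rare slow tail, not measured data from this thread.]

```java
import java.util.Arrays;

public class LatencyPercentiles {
    // Nearest-rank percentile over a sorted copy of recorded latencies (ms).
    static double percentile(long[] latencies, double pct) {
        long[] sorted = latencies.clone();
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(pct / 100.0 * sorted.length) - 1;
        return sorted[Math.max(idx, 0)];
    }

    public static void main(String[] args) {
        // Synthetic distribution: 99% fast acks, 1% slow outliers.
        long[] lat = new long[1000];
        for (int i = 0; i < 990; i++) lat[i] = 5;
        for (int i = 990; i < 1000; i++) lat[i] = 1000;

        long sum = 0;
        for (long v : lat) sum += v;
        // The average sits well above the median because the 1% tail
        // drags it up, while 50%'ile and 99.9%'ile tell the real story.
        System.out.println("avg=" + (sum / (double) lat.length));
        System.out.println("p50=" + percentile(lat, 50));
        System.out.println("p99.9=" + percentile(lat, 99.9));
    }
}
```

Feeding one such histogram per partition (as Erik suggests) would expose a fork like the one he observed at the 99%'ile.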
Re: latency test
Thanks for your reply Erik. I am running some more tests according to your suggestions now and I will share with my results here. Is it necessary to use a fixed number of partitions (32 partitions maybe) for my test? I am testing 2, 4, 8, 16 and 32 brokers scenarios, all of them are running on individual physical nodes. So I think using at least 32 partitions will make more sense? I have seen latencies increase as the number of partitions goes up in my experiments. To get the latency of each event data recorded, are you suggesting that I rewrite my own test program (in Java perhaps) or I can just modify the standard test program provided by kafka ( https://gist.github.com/jkreps/c7ddb4041ef62a900e6c )? I guess I need to rebuild the source if I modify the standard java test program ProducerPerformance provided in kafka, right? Now this standard program only has average latencies and percentile latencies but no per event latencies. Thanks. On Fri, Sep 4, 2015 at 1:42 PM, Helleren, Erik wrote: > That is an excellent question! There are a bunch of ways to monitor > jitter and see when that is happening. Here are a few: > > - You could slice the histogram every few seconds, save it out with a > timestamp, and then look at how they compare. This would be mostly > manual, or you can graph line charts of the percentiles over time in excel > where each percentile would be a series. If you are using HDR Histogram, > you should look at how to use the Recorder class to do this coupled with a > ScheduledExecutorService. > > - You can just save the starting timestamp of the event and the latency of > each event. If you put it into a CSV, you can just load it up into excel > and graph as a XY chart. That way you can see every point during the > running of your program and you can see trends. You want to be careful > about this one, especially of writing to a file in the callback that kafka > provides. 
> > Also, I have noticed that most of the very slow observations are at > startup. But don’t trust me, trust the data and share your findings. > Also, having a 99.9 percentile provides a pretty good standard for typical > poor case performance. Average is borderline useless, 50%’ile is a better > typical case because that’s the number that says “half of events will be > this slow or faster”, or for values that are high like 99.9%’ile, “0.1% of > all events will be slower than this”. > -Erik > > On 9/4/15, 12:05 PM, "Yuheng Du" wrote: > > >Thank you Erik! That's helpful! > > > >But also I see jitters of the maximum latencies when running the > >experiment. > > > >The average end to acknowledgement latency from producer to broker is > >around 5ms when using 92 producers and 4 brokers, and the 99.9 percentile > >latency is 58ms, but the maximum latency goes up to 1359 ms. How can I locate > >the source of this jitter? > > > >Thanks. > > > >On Fri, Sep 4, 2015 at 10:54 AM, Helleren, Erik > > > >wrote: > > > >> Well… not to be contrarian, but latency depends much more on the latency > >> between the producer and the broker that is the leader for the partition > >> you are publishing to. At least when your brokers are not saturated > >>with > >> messages, and acks are set to 1. If acks are set to ALL, latency on a > >> non-saturated kafka cluster will be: Round Trip Latency from producer to > >> leader for partition + Max( slowest Round Trip Latency to a replica of > >> that partition). If a cluster is saturated with messages, we have to > >> assume that all partitions receive an equal distribution of messages to > >> avoid linear algebra and queueing theory models. I don't like linear > >> algebra :P > >> > >> Since you are probably putting all your latencies into a single > >>histogram > >> per producer, or worse, just an average, this pattern would have been > >> obscured. 
Obligatory lecture about measuring latency by Gil Tene > >> (https://www.youtube.com/watch?v=9MKY4KypBzg). To verify this > >>hypothesis, > >> you should re-write the benchmark to plot the latencies for each write > >>to > >> a partition for each producer into a histogram. (HDR Histogram is pretty > >> good for that). This would give you producers*partitions histograms, > >> which might be unwieldy for that many producers. But wait, there is > >>hope! > >> > >> To verify that this hypothesis holds, you just have to see that there > >>is a > >> significant difference between different partitions on a SINGLE > >>producing > >> client. So, pick one producing client at random an
Re: latency test
Thank you Erik! That's helpful! But also I see jitters of the maximum latencies when running the experiment. The average end to acknowledgement latency from producer to broker is around 5ms when using 92 producers and 4 brokers, and the 99.9 percentile latency is 58ms, but the maximum latency goes up to 1359 ms. How can I locate the source of this jitter? Thanks. On Fri, Sep 4, 2015 at 10:54 AM, Helleren, Erik wrote: > Well… not to be contrarian, but latency depends much more on the latency > between the producer and the broker that is the leader for the partition > you are publishing to. At least when your brokers are not saturated with > messages, and acks are set to 1. If acks are set to ALL, latency on a > non-saturated kafka cluster will be: Round Trip Latency from producer to > leader for partition + Max( slowest Round Trip Latency to a replica of > that partition). If a cluster is saturated with messages, we have to > assume that all partitions receive an equal distribution of messages to > avoid linear algebra and queueing theory models. I don't like linear > algebra :P > > Since you are probably putting all your latencies into a single histogram > per producer, or worse, just an average, this pattern would have been > obscured. Obligatory lecture about measuring latency by Gil Tene > (https://www.youtube.com/watch?v=9MKY4KypBzg). To verify this hypothesis, > you should re-write the benchmark to plot the latencies for each write to > a partition for each producer into a histogram. (HDR Histogram is pretty > good for that). This would give you producers*partitions histograms, > which might be unwieldy for that many producers. But wait, there is hope! > > To verify that this hypothesis holds, you just have to see that there is a > significant difference between different partitions on a SINGLE producing > client. So, pick one producing client at random and use the data from > that. 
The easy way to do that is just plot all the partition latency > histograms on top of each other in the same plot, that way you have a > pretty plot to show people. If you don't want to setup plotting, you can > just compare the medians (50th percentile) of the partitions' histograms. > If there is a lot of variance, your latency anomaly is explained by > brokers 4-7 being slower than nodes 0-3! If there isn't a lot of variance > at 50%, look at higher percentiles. And if higher percentiles for all the > partitions look the same, this hypothesis is disproved. > > If you want to make a general statement about latency of writing to kafka, > you can merge all the histograms into a single histogram and plot that. > > To Yuheng's credit, more brokers always result in more throughput. But > throughput and latency are two different creatures. It's worth noting that > kafka is designed to be high throughput first and low latency second. And > it does a really good job at both. > > Disclaimer: I might not like linear algebra, but I do like statistics. > Let me know if there are topics that need more explanation above that > aren't covered by Gil's lecture. > -Erik > > On 9/4/15, 9:03 AM, "Yuheng Du" wrote: > > >When I use 32 partitions, the 4 brokers latency becomes larger than the > >8 > >brokers latency. > > > >So is it always true that using more brokers gives lower latency when > >the > >number of partitions is at least the number of brokers? > > > >Thanks. > > > >On Thu, Sep 3, 2015 at 10:45 PM, Yuheng Du > >wrote: > > > >> I am running a producer latency test. When using 92 producers on 92 > >> physical nodes publishing to 4 brokers, the latency is slightly lower > >>than > >> using 8 brokers; I am using 8 partitions for the topic. > >> > >> I have rerun the test and it gives me the same result, the 4 brokers > >> scenario still has lower latency than the 8 brokers scenario. 
> >> > > >> It is weird because I tested 1 broker, 2 brokers, 4 brokers, 8 brokers, > >>16 > >> brokers and 32 brokers. For the rest of the cases the latency decreases > >>as > >> the number of brokers increases. > >> > >> 4 brokers/8 brokers is the only pair that doesn't satisfy this rule. > >>What > >> could be the cause? > >> > >> I am using a 200-byte message, and the test has each producer publish > >>500k > >> messages to a given topic. Every test run when I change the number of > >> brokers, I use a new topic. > >> > >> Thanks for any advice. > >> > >
Re: When a message is exposed to the consumer
Can't read it. Sorry On Fri, Sep 4, 2015 at 12:08 PM, Roman Shramkov wrote: > Её ай н Анны уйг > > sent from a mobile device, please excuse brevity and typos > > > ----User Yuheng Du wrote > > According to section 3.1 of the paper "Kafka: a Distributed Messaging > System for Log Processing": > > "a message is only exposed to the consumers after it is flushed". > > Is this still true in current Kafka? That is, is a message only > available to consumers after it is flushed to disk? > > Thanks. >
Re: latency test
When I use 32 partitions, the 4 brokers latency becomes larger than the 8 brokers latency. So is it always true that using more brokers gives lower latency when the number of partitions is at least the number of brokers? Thanks. On Thu, Sep 3, 2015 at 10:45 PM, Yuheng Du wrote: > I am running a producer latency test. When using 92 producers on 92 > physical nodes publishing to 4 brokers, the latency is slightly lower than > using 8 brokers; I am using 8 partitions for the topic. > > I have rerun the test and it gives me the same result, the 4 brokers > scenario still has lower latency than the 8 brokers scenario. > > It is weird because I tested 1 broker, 2 brokers, 4 brokers, 8 brokers, 16 > brokers and 32 brokers. For the rest of the cases the latency decreases as > the number of brokers increases. > > 4 brokers/8 brokers is the only pair that doesn't satisfy this rule. What > could be the cause? > > I am using a 200-byte message, and the test has each producer publish 500k > messages to a given topic. Every test run when I change the number of > brokers, I use a new topic. > > Thanks for any advice. >
When a message is exposed to the consumer
According to section 3.1 of the paper "Kafka: a Distributed Messaging System for Log Processing": "a message is only exposed to the consumers after it is flushed". Is this still true in current Kafka? That is, is a message only available to consumers after it is flushed to disk? Thanks.
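[Editor's note, hedged, from general knowledge of post-0.8 Kafka rather than from a reply in this thread: consumers can read messages up to the partition's high watermark, i.e. once they are replicated to the in-sync replicas; this no longer requires an fsync to disk. Flushing is normally left to the OS page cache unless forced via the broker flush settings:]

```properties
# server.properties (sketch): by default both settings are effectively
# unset and flushing is left to the OS; setting them forces periodic
# fsyncs at the cost of throughput.
log.flush.interval.messages=10000
log.flush.interval.ms=1000
```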
latency test
I am running a producer latency test. When using 92 producers on 92 physical nodes publishing to 4 brokers, the latency is slightly lower than using 8 brokers; I am using 8 partitions for the topic. I have rerun the test and it gives me the same result, the 4 brokers scenario still has lower latency than the 8 brokers scenario. It is weird because I tested 1 broker, 2 brokers, 4 brokers, 8 brokers, 16 brokers and 32 brokers. For the rest of the cases the latency decreases as the number of brokers increases. 4 brokers/8 brokers is the only pair that doesn't satisfy this rule. What could be the cause? I am using a 200-byte message, and the test has each producer publish 500k messages to a given topic. Every test run when I change the number of brokers, I use a new topic. Thanks for any advice.
Re: Reduce latency
I see. So the internal queue overrides the producer buffer size configuration? When the buffer is full, the producer will block sending, right? On Tue, Aug 18, 2015 at 3:52 PM, Tao Feng wrote: > From what I understand, if you set the throughput to -1, the > producerperformance will push records as much as possible to an internal > per topic per partition queue. In the background there is a sender IO > thread handling the actual record sending process. If you push record to > the queue faster than the send rate, your queue will become longer and > longer, eventually record latency will become meaningless for a > latency-purpose test. > > > On Tue, Aug 18, 2015 at 11:48 AM, Yuheng Du > wrote: > > > I see. Thank you Tao. But now I don't get what Jay said, that my > latency > > test only makes sense if I set a fixed throughput. Why do I need to set a > > fixed throughput for my test instead of just setting the expected throughput > to > > be -1 (as much as possible)? > > > > Thanks. > > > > On Tue, Aug 18, 2015 at 2:43 PM, Tao Feng wrote: > > > > > Hi Yuheng, > > > > > > The 1 record/s is just a param for producerperformance for your > > > producer target tput. It only takes effect to do the throttling if you > > > try to send more than 1 record/s. The actual tput of the test > > > depends on your producer config and your setup. > > > > > > -Tao > > > > > > On Tue, Aug 18, 2015 at 11:34 AM, Yuheng Du > > > wrote: > > > > > > > Also, When I set the target throughput to be 1 records/s, The > > actual > > > > test results show I got an average of 579.86 records per second among > > all > > > > my producers. How did that happen? Why is this number not 1 then? > > > > Thanks. > > > > > > > > On Tue, Aug 18, 2015 at 10:03 AM, Yuheng Du < > yuheng.du.h...@gmail.com> > > > > wrote: > > > > > > > > > Thank you Jay, that really helps! > > > > > > > > > > Kishore, Where can you monitor whether the network is busy on IO in > > > > VisualVM? Thanks. 
I am running 90 producer processes on 90 physical > machines > > in > > > > the > > > > > experiment. > > > > > > > > > > On Tue, Aug 18, 2015 at 1:19 AM, Jay Kreps > wrote: > > > > > > > > > >> Yuheng, > > > > >> > > > > >> From the command you gave it looks like you are configuring the > perf > > > > test > > > > >> to send data as fast as possible (the -1 for target throughput). > > This > > > > >> means > > > > >> it will always queue up a bunch of unsent data until the buffer is > > > > >> exhausted and then block. The larger the buffer, the bigger the > > queue. > > > > >> This > > > > >> is where the latency comes from. This is exactly what you would > > expect > > > > and > > > > >> what the buffering is supposed to do. > > > > >> > > > > >> If you want to measure latency this test doesn't really make > sense, > > > you > > > > >> need to measure with some fixed throughput. Instead of -1 enter > the > > > > target > > > > >> throughput you want to measure latency at (e.g. 10 > records/sec). > > > > >> > > > > >> -Jay > > > > >> > > > > >> On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du < > > yuheng.du.h...@gmail.com > > > > > > > > >> wrote: > > > > >> > > > > >> > Thank you Alvaro, > > > > >> > > > > > >> > How to use sync producers? I am running the standard > > > > ProducerPerformance > > > > >> > test from kafka to measure the latency of each message to send > > from > > > > >> > producer to broker only. > > > > >> > The command is like "bin/kafka-run-class.sh > > > > >> > org.apache.kafka.clients.tools.ProducerPerformance test7 > 5000 > > > 100 > > > > -1 > > > > >> > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 > > > > >> > buffer.memory=67108864 batch.size=8196" > > > > >> > > > > > >> > For running producers
Re: Reduce latency
I see. Thank you Tao. But now I don't get what Jay said, that my latency test only makes sense if I set a fixed throughput. Why do I need to set a fixed throughput for my test instead of just setting the expected throughput to be -1 (as much as possible)? Thanks. On Tue, Aug 18, 2015 at 2:43 PM, Tao Feng wrote: > Hi Yuheng, > > The 1 record/s is just a param for producerperformance for your > producer target tput. It only takes effect to do the throttling if you > try to send more than 1 record/s. The actual tput of the test > depends on your producer config and your setup. > > -Tao > > On Tue, Aug 18, 2015 at 11:34 AM, Yuheng Du > wrote: > > > Also, When I set the target throughput to be 1 records/s, The actual > > test results show I got an average of 579.86 records per second among all > > my producers. How did that happen? Why is this number not 1 then? > > Thanks. > > > > On Tue, Aug 18, 2015 at 10:03 AM, Yuheng Du > > wrote: > > > > > Thank you Jay, that really helps! > > > > > > Kishore, Where can you monitor whether the network is busy on IO in > > VisualVM? Thanks. I am running 90 producer processes on 90 physical machines in > > the > > > experiment. > > > > > > On Tue, Aug 18, 2015 at 1:19 AM, Jay Kreps wrote: > > > > > >> Yuheng, > > >> > > >> From the command you gave it looks like you are configuring the perf > > test > > >> to send data as fast as possible (the -1 for target throughput). This > > >> means > > >> it will always queue up a bunch of unsent data until the buffer is > > >> exhausted and then block. The larger the buffer, the bigger the queue. > > >> This > > >> is where the latency comes from. This is exactly what you would expect > > and > > >> what the buffering is supposed to do. > > >> > > >> If you want to measure latency this test doesn't really make sense, > you > > >> need to measure with some fixed throughput. Instead of -1 enter the > > target > > >> throughput you want to measure latency at (e.g. 10 records/sec). 
> > >> > > >> -Jay > > >> > > >> On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du > > > >> wrote: > > >> > > >> > Thank you Alvaro, > > >> > > > >> > How to use sync producers? I am running the standard > > ProducerPerformance > > >> > test from kafka to measure the latency of each message to send from > > >> > producer to broker only. > > >> > The command is like "bin/kafka-run-class.sh > > >> > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 > 100 > > -1 > > >> > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 > > >> > buffer.memory=67108864 batch.size=8196" > > >> > > > >> > For running producers, where should I put the producer.type=sync > > >> > configuration into? The config/server.properties? Also Does this > mean > > we > > >> > are using batch size of 1? Which version of Kafka are you using? > > >> > thanks. > > >> > > > >> > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe > > > >> > wrote: > > >> > > > >> > > Are you measuring latency as time between producer and consumer ? > > >> > > > > >> > > In that case, the ack shouldn't affect the latency, cause even > tough > > >> your > > >> > > producer is not going to wait for the ack, the consumer will only > > get > > >> the > > >> > > message after its commited in the server. > > >> > > > > >> > > About latency my best result occur with sync producers, but the > > >> > throughput > > >> > > is much lower in that case. > > >> > > > > >> > > About not flushing to disk I'm pretty sure that it's not an option > > in > > >> > kafka > > >> > > (correct me if I'm wrong) > > >> > > > > >> > > Regards, > > >> > > Alvaro Gareppe > > >> > > > > >> > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du < > > yuheng.du.h...@gmail.com > > >> > > > >> > > wrote: > > >> > > > > >> > > > Also, the latency results show no major difference when using > > ack=0 > > >> or > > >> > > > ack=1. Why is that? 
> > >> > > > > > >> > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du < > > >> yuheng.du.h...@gmail.com> > > >> > > > wrote: > > >> > > > > > >> > > > > I am running an experiment where 92 producers is publishing > data > > >> > into 6 > > >> > > > > brokers and 10 consumer are reading online data > simultaneously. > > >> > > > > > > >> > > > > How should I do to reduce the latency? Currently when I run > the > > >> > > producer > > >> > > > > performance test the average latency is around 10s. > > >> > > > > > > >> > > > > Should I disable log.flush? How to do that? Thanks. > > >> > > > > > > >> > > > > > >> > > > > >> > > > > >> > > > > >> > > -- > > >> > > Ing. Alvaro Gareppe > > >> > > agare...@gmail.com > > >> > > > > >> > > > >> > > > > > > > > >
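Tao's point about the target-throughput parameter can be illustrated with a small sketch. This is a hypothetical re-implementation of the throttling idea, not the actual ProducerPerformance code: with a positive target the sender sleeps so its average rate never exceeds the target, while -1 disables throttling entirely.

```python
import time

# Hypothetical sketch of the throttling Tao describes: a positive target
# rate caps the average send rate; -1 means "send as fast as possible".
def send_throttled(num_records, target_per_sec, send_fn):
    start = time.time()
    for i in range(num_records):
        send_fn(i)
        if target_per_sec > 0:
            allowed = (i + 1) / target_per_sec  # earliest time for next send
            elapsed = time.time() - start
            if elapsed < allowed:
                time.sleep(allowed - elapsed)
    return time.time() - start

# 20 records at a 100 records/s target should take about 0.2 s;
# with -1 they go out as fast as the no-op send_fn allows.
throttled = send_throttled(20, 100, lambda i: None)
unthrottled = send_throttled(20, -1, lambda i: None)
print(round(throttled, 2), round(unthrottled, 4))
```

Note that a throttle of this shape only ever delays sends; it never pushes the rate up, so the actual throughput still depends on the producer config and setup, as Tao says.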
Re: Reduce latency
Also, When I set the target throughput to be 1 records/s, The actual test results show I got an average of 579.86 records per second among all my producers. How did that happen? Why this number is not 1 then? Thanks. On Tue, Aug 18, 2015 at 10:03 AM, Yuheng Du wrote: > Thank you Jay, that really helps! > > Kishore, Where you can monitor whether the network is busy on IO in visual > vm? Thanks. I am running 90 producer process on 90 physical machines in the > experiment. > > On Tue, Aug 18, 2015 at 1:19 AM, Jay Kreps wrote: > >> Yuheng, >> >> From the command you gave it looks like you are configuring the perf test >> to send data as fast as possible (the -1 for target throughput). This >> means >> it will always queue up a bunch of unsent data until the buffer is >> exhausted and then block. The larger the buffer, the bigger the queue. >> This >> is where the latency comes from. This is exactly what you would expect and >> what the buffering is supposed to do. >> >> If you want to measure latency this test doesn't really make sense, you >> need to measure with some fixed throughput. Instead of -1 enter the target >> throughput you want to measure latency at (e.g. 10 records/sec). >> >> -Jay >> >> On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du >> wrote: >> >> > Thank you Alvaro, >> > >> > How to use sync producers? I am running the standard ProducerPerformance >> > test from kafka to measure the latency of each message to send from >> > producer to broker only. >> > The command is like "bin/kafka-run-class.sh >> > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 >> > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 >> > buffer.memory=67108864 batch.size=8196" >> > >> > For running producers, where should I put the producer.type=sync >> > configuration into? The config/server.properties? Also Does this mean we >> > are using batch size of 1? Which version of Kafka are you using? >> > thanks. 
>> > >> > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe >> > wrote: >> > >> > > Are you measuring latency as time between producer and consumer ? >> > > >> > > In that case, the ack shouldn't affect the latency, cause even tough >> your >> > > producer is not going to wait for the ack, the consumer will only get >> the >> > > message after its commited in the server. >> > > >> > > About latency my best result occur with sync producers, but the >> > throughput >> > > is much lower in that case. >> > > >> > > About not flushing to disk I'm pretty sure that it's not an option in >> > kafka >> > > (correct me if I'm wrong) >> > > >> > > Regards, >> > > Alvaro Gareppe >> > > >> > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du > > >> > > wrote: >> > > >> > > > Also, the latency results show no major difference when using ack=0 >> or >> > > > ack=1. Why is that? >> > > > >> > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du < >> yuheng.du.h...@gmail.com> >> > > > wrote: >> > > > >> > > > > I am running an experiment where 92 producers is publishing data >> > into 6 >> > > > > brokers and 10 consumer are reading online data simultaneously. >> > > > > >> > > > > How should I do to reduce the latency? Currently when I run the >> > > producer >> > > > > performance test the average latency is around 10s. >> > > > > >> > > > > Should I disable log.flush? How to do that? Thanks. >> > > > > >> > > > >> > > >> > > >> > > >> > > -- >> > > Ing. Alvaro Gareppe >> > > agare...@gmail.com >> > > >> > >> > >
Re: Reduce latency
Thank you Jay, that really helps! Kishore, Where you can monitor whether the network is busy on IO in visual vm? Thanks. I am running 90 producer process on 90 physical machines in the experiment. On Tue, Aug 18, 2015 at 1:19 AM, Jay Kreps wrote: > Yuheng, > > From the command you gave it looks like you are configuring the perf test > to send data as fast as possible (the -1 for target throughput). This means > it will always queue up a bunch of unsent data until the buffer is > exhausted and then block. The larger the buffer, the bigger the queue. This > is where the latency comes from. This is exactly what you would expect and > what the buffering is supposed to do. > > If you want to measure latency this test doesn't really make sense, you > need to measure with some fixed throughput. Instead of -1 enter the target > throughput you want to measure latency at (e.g. 10 records/sec). > > -Jay > > On Thu, Aug 13, 2015 at 12:18 PM, Yuheng Du > wrote: > > > Thank you Alvaro, > > > > How to use sync producers? I am running the standard ProducerPerformance > > test from kafka to measure the latency of each message to send from > > producer to broker only. > > The command is like "bin/kafka-run-class.sh > > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 > > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 > > buffer.memory=67108864 batch.size=8196" > > > > For running producers, where should I put the producer.type=sync > > configuration into? The config/server.properties? Also Does this mean we > > are using batch size of 1? Which version of Kafka are you using? > > thanks. > > > > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe > > wrote: > > > > > Are you measuring latency as time between producer and consumer ? > > > > > > In that case, the ack shouldn't affect the latency, cause even tough > your > > > producer is not going to wait for the ack, the consumer will only get > the > > > message after its commited in the server. 
> > > > > > About latency my best result occur with sync producers, but the > > throughput > > > is much lower in that case. > > > > > > About not flushing to disk I'm pretty sure that it's not an option in > > kafka > > > (correct me if I'm wrong) > > > > > > Regards, > > > Alvaro Gareppe > > > > > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du > > > wrote: > > > > > > > Also, the latency results show no major difference when using ack=0 > or > > > > ack=1. Why is that? > > > > > > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du < > yuheng.du.h...@gmail.com> > > > > wrote: > > > > > > > > > I am running an experiment where 92 producers is publishing data > > into 6 > > > > > brokers and 10 consumer are reading online data simultaneously. > > > > > > > > > > How should I do to reduce the latency? Currently when I run the > > > producer > > > > > performance test the average latency is around 10s. > > > > > > > > > > Should I disable log.flush? How to do that? Thanks. > > > > > > > > > > > > > > > > > > > > > -- > > > Ing. Alvaro Gareppe > > > agare...@gmail.com > > > > > >
Re: Reduce latency
Thank you Kishore, I made the buffer twice the size of the batch size and the latency has reduced significantly. But is there only one thread io thread sending the batches? Can I increase the number of threads sending the batches so more than one batch could be sent at the same time? Thanks. On Thu, Aug 13, 2015 at 5:38 PM, Kishore Senji wrote: > Your batch.size is 8196 and your buffer.memory is 67108864. This means > 67108864/8196 > ~ 8188 batches are in memory ready to the sent. There is only one thread io > thread sending them. I would guess that the io thread ( > kafka-producer-network-thread) would be busy. Please check it in visual vm. > > In steady state, every batch has to wait for the previous 8187 batches to > be done before it gets a chance to be sent out, but the latency is counted > from the time is added to the queue. This is the reason that you are seeing > very high end-to-end latency. > > Have the buffer.memory to be only twice that of the batch.size so that > while one is in flight, you can another batch ready to go (and the > KafkaProducer would block to send more when there is no memory and this way > the batches are not waiting in the queue unnecessarily) . Also may be you > want to increase the batch.size further more, you will get even better > throughput with more or less same latency (as there is no shortage of > events in the test program). > > On Thu, Aug 13, 2015 at 1:13 PM Yuheng Du > wrote: > > > Yes there is. But if we are using ProducerPerformance test, it's > configured > > as giving input when running the test command. Do you write a java > program > > to test the latency? Thanks. > > > > On Thu, Aug 13, 2015 at 3:54 PM, Alvaro Gareppe > > wrote: > > > > > I'm using last one, but not using the ProducerPerformance, I created my > > > own. but I think there is a producer.properties file in config folder > in > > > kafka.. is that configuration not for this tester ? 
> > > > > > On Thu, Aug 13, 2015 at 4:18 PM, Yuheng Du > > > wrote: > > > > > > > Thank you Alvaro, > > > > > > > > How to use sync producers? I am running the standard > > ProducerPerformance > > > > test from kafka to measure the latency of each message to send from > > > > producer to broker only. > > > > The command is like "bin/kafka-run-class.sh > > > > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 > > -1 > > > > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 > > > > buffer.memory=67108864 batch.size=8196" > > > > > > > > For running producers, where should I put the producer.type=sync > > > > configuration into? The config/server.properties? Also Does this mean > > we > > > > are using batch size of 1? Which version of Kafka are you using? > > > > thanks. > > > > > > > > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe > > > > wrote: > > > > > > > > > Are you measuring latency as time between producer and consumer ? > > > > > > > > > > In that case, the ack shouldn't affect the latency, cause even > tough > > > your > > > > > producer is not going to wait for the ack, the consumer will only > get > > > the > > > > > message after its commited in the server. > > > > > > > > > > About latency my best result occur with sync producers, but the > > > > throughput > > > > > is much lower in that case. > > > > > > > > > > About not flushing to disk I'm pretty sure that it's not an option > in > > > > kafka > > > > > (correct me if I'm wrong) > > > > > > > > > > Regards, > > > > > Alvaro Gareppe > > > > > > > > > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du < > > yuheng.du.h...@gmail.com> > > > > > wrote: > > > > > > > > > > > Also, the latency results show no major difference when using > ack=0 > > > or > > > > > > ack=1. Why is that? 
> > > > > > > > > > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du < > > > yuheng.du.h...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > I am running an experiment where 92 producers is publishing > data > > > > into 6 > > > > > > > brokers and 10 consumer are reading online data simultaneously. > > > > > > > > > > > > > > How should I do to reduce the latency? Currently when I run the > > > > > producer > > > > > > > performance test the average latency is around 10s. > > > > > > > > > > > > > > Should I disable log.flush? How to do that? Thanks. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Ing. Alvaro Gareppe > > > > > agare...@gmail.com > > > > > > > > > > > > > > > > > > > > > -- > > > Ing. Alvaro Gareppe > > > agare...@gmail.com > > > > > >
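Kishore's arithmetic can be checked directly. The byte figures come from the test command quoted in this thread; the io-thread drain rate is an assumed number, used only to show the order of magnitude of the queueing delay.

```python
# Numbers from the test command in this thread.
buffer_memory = 67108864   # buffer.memory in bytes
batch_size = 8196          # batch.size in bytes

queued_batches = buffer_memory // batch_size
print(queued_batches)  # 8188 batches can be queued ahead of a new record

# With an assumed drain rate of 1000 batches/s for the single io thread,
# the queueing delay alone is on the order of seconds:
drain_rate = 1000
queue_delay_s = queued_batches / drain_rate
print(round(queue_delay_s, 2))  # ~8.19 s, the same order as the ~10 s observed

# Kishore's suggestion: keep only two batches' worth of buffer, so one
# batch can be in flight while the next is being filled.
suggested_buffer = 2 * batch_size
print(suggested_buffer)  # 16392 bytes
```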
Re: use page cache as much as possible
Thank you Kishore, I see that the end-to-end latency may not be reduced by resetting the flush time manually. But if the default flush.ms is Long.Max_Value, why I see the disk usage of the brokers constantly increasing when the producer is pushing in data? Should that be happen once a while? The os page cache usage should not be reflected when using "watch df -h" command, am I correct? Thanks. On Fri, Aug 14, 2015 at 10:12 PM, Kishore Senji wrote: > Actually in 0.8.2, flush.ms & flush.messages are recommended to be left > defaults (Long.MAX_VALUE) > http://kafka.apache.org/documentation.html (search for flush.ms) > > The disk flush and the committed offset are two independent things. As long > as you have replication, the recommended thing is to leave the flushing to > the OS. But if you choose to flush manually the time interval at which you > flush may not influence the end-to-end latency from the Producer to > Consumer, however it can influence the throughput of the broker. > > On Fri, Aug 14, 2015 at 9:20 AM Yuheng Du > wrote: > > > So if I understand correctly, even if I delay flushing, the consumer will > > get the messages as soon as the broker receives them and put them into > page > > cache (assuming producer doesn't wait for acks from brokers)? > > > > And will the decrease of log.flush interval help reduce latency between > > producer and consumer? > > > > Thanks. > > > > > > On Fri, Aug 14, 2015 at 11:57 AM, Kishore Senji > wrote: > > > > > Thank you Gwen for correcting me. This document ( > > > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Replication) > in > > > "Writes" section also has specified the same thing as you have > mentioned. > > > One thing is not clear to me as to what happens when the Replicas add > the > > > message to memory but the leader fails before acking to the producer. > > Later > > > the leader replica is chosen to be the leader for the partition, it > will > > > advance the HW to its LEO (which has the message). 
The producer can > > resend > > > the same message thinking it failed and there will be a duplicate > > message. > > > Is my understanding correct here? > > > > > > On Thu, Aug 13, 2015 at 10:50 PM, Gwen Shapira > > wrote: > > > > > > > On Thu, Aug 13, 2015 at 4:10 PM, Kishore Senji > > wrote: > > > > > > > > > Consumers can only fetch data up to the committed offset and the > > reason > > > > is > > > > > reliability and durability on a broker crash (some consumers might > > get > > > > the > > > > > new data and some may not as the data is not yet committed and > lost). > > > > Data > > > > > will be committed when it is flushed. So if you delay the flushing, > > > > > consumers won't get those messages until that time. > > > > > > > > > > > > > As far as I know, this is not accurate. > > > > > > > > A message is considered committed when all ISR replicas received it > > (this > > > > much is documented). This doesn't need to include writing to disk, > > which > > > > will happen asynchronously. > > > > > > > > > > > > > > > > > > Even though you flush periodically based on > > log.flush.interval.messages > > > > and > > > > > log.flush.interval.ms, if the segment file is in the pagecache, > the > > > > > consumers will still benefit from that pagecache and OS wouldn't > read > > > it > > > > > again from disk. > > > > > > > > > > On Thu, Aug 13, 2015 at 2:54 PM Yuheng Du < > yuheng.du.h...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > As I understand it, kafka brokers will store the incoming > messages > > > into > > > > > > pagecache as much as possible and then flush them into disk, > right? > > > > > > > > > > > > But in my experiment where 90 producers is publishing data into 6 > > > > > brokers, > > > > > > I see that the log directory on disk where broker stores the data > > is > > > > > > constantly increasing (every seconds.) So why this is happening? 
> > Does > > > > > this > > > > > > has to do with the default "log.flush.interval" setting? > > > > > > > > > > > > I want the broker to write to disk less often when serving some > > > on-line > > > > > > consumers to reduce latency. I tested in my broker the disk write > > > speed > > > > > is > > > > > > around 110MB/s. > > > > > > > > > > > > Thanks for any replies. > > > > > > > > > > > > > > > > > > > > >
Re: use page cache as much as possible
So if I understand correctly, even if I delay flushing, the consumer will get the messages as soon as the broker receives them and put them into page cache (assuming producer doesn't wait for acks from brokers)? And will the decrease of log.flush interval help reduce latency between producer and consumer? Thanks. On Fri, Aug 14, 2015 at 11:57 AM, Kishore Senji wrote: > Thank you Gwen for correcting me. This document ( > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Replication) in > "Writes" section also has specified the same thing as you have mentioned. > One thing is not clear to me as to what happens when the Replicas add the > message to memory but the leader fails before acking to the producer. Later > the leader replica is chosen to be the leader for the partition, it will > advance the HW to its LEO (which has the message). The producer can resend > the same message thinking it failed and there will be a duplicate message. > Is my understanding correct here? > > On Thu, Aug 13, 2015 at 10:50 PM, Gwen Shapira wrote: > > > On Thu, Aug 13, 2015 at 4:10 PM, Kishore Senji wrote: > > > > > Consumers can only fetch data up to the committed offset and the reason > > is > > > reliability and durability on a broker crash (some consumers might get > > the > > > new data and some may not as the data is not yet committed and lost). > > Data > > > will be committed when it is flushed. So if you delay the flushing, > > > consumers won't get those messages until that time. > > > > > > > As far as I know, this is not accurate. > > > > A message is considered committed when all ISR replicas received it (this > > much is documented). This doesn't need to include writing to disk, which > > will happen asynchronously. 
> > > > > > > > > > Even though you flush periodically based on log.flush.interval.messages > > and > > > log.flush.interval.ms, if the segment file is in the pagecache, the > > > consumers will still benefit from that pagecache and OS wouldn't read > it > > > again from disk. > > > > > > On Thu, Aug 13, 2015 at 2:54 PM Yuheng Du > > > wrote: > > > > > > > Hi, > > > > > > > > As I understand it, kafka brokers will store the incoming messages > into > > > > pagecache as much as possible and then flush them into disk, right? > > > > > > > > But in my experiment where 90 producers is publishing data into 6 > > > brokers, > > > > I see that the log directory on disk where broker stores the data is > > > > constantly increasing (every seconds.) So why this is happening? Does > > > this > > > > has to do with the default "log.flush.interval" setting? > > > > > > > > I want the broker to write to disk less often when serving some > on-line > > > > consumers to reduce latency. I tested in my broker the disk write > speed > > > is > > > > around 110MB/s. > > > > > > > > Thanks for any replies. > > > > > > > > > >
use page cache as much as possible
Hi, As I understand it, Kafka brokers will store incoming messages in the page cache as much as possible and then flush them to disk, right? But in my experiment, where 90 producers are publishing data into 6 brokers, I see that the log directory on disk where the broker stores the data is constantly growing (every second). Why is this happening? Does it have to do with the default "log.flush.interval" setting? I want the broker to write to disk less often when serving online consumers, to reduce latency. I tested the disk write speed on my broker and it is around 110MB/s. Thanks for any replies.
Re: Reduce latency
Yes there is. But if we are using ProducerPerformance test, it's configured as giving input when running the test command. Do you write a java program to test the latency? Thanks. On Thu, Aug 13, 2015 at 3:54 PM, Alvaro Gareppe wrote: > I'm using last one, but not using the ProducerPerformance, I created my > own. but I think there is a producer.properties file in config folder in > kafka.. is that configuration not for this tester ? > > On Thu, Aug 13, 2015 at 4:18 PM, Yuheng Du > wrote: > > > Thank you Alvaro, > > > > How to use sync producers? I am running the standard ProducerPerformance > > test from kafka to measure the latency of each message to send from > > producer to broker only. > > The command is like "bin/kafka-run-class.sh > > org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 > > acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 > > buffer.memory=67108864 batch.size=8196" > > > > For running producers, where should I put the producer.type=sync > > configuration into? The config/server.properties? Also Does this mean we > > are using batch size of 1? Which version of Kafka are you using? > > thanks. > > > > On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe > > wrote: > > > > > Are you measuring latency as time between producer and consumer ? > > > > > > In that case, the ack shouldn't affect the latency, cause even tough > your > > > producer is not going to wait for the ack, the consumer will only get > the > > > message after its commited in the server. > > > > > > About latency my best result occur with sync producers, but the > > throughput > > > is much lower in that case. > > > > > > About not flushing to disk I'm pretty sure that it's not an option in > > kafka > > > (correct me if I'm wrong) > > > > > > Regards, > > > Alvaro Gareppe > > > > > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du > > > wrote: > > > > > > > Also, the latency results show no major difference when using ack=0 > or > > > > ack=1. Why is that? 
> > > > > > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du < > yuheng.du.h...@gmail.com> > > > > wrote: > > > > > > > > > I am running an experiment where 92 producers is publishing data > > into 6 > > > > > brokers and 10 consumer are reading online data simultaneously. > > > > > > > > > > How should I do to reduce the latency? Currently when I run the > > > producer > > > > > performance test the average latency is around 10s. > > > > > > > > > > Should I disable log.flush? How to do that? Thanks. > > > > > > > > > > > > > > > > > > > > > -- > > > Ing. Alvaro Gareppe > > > agare...@gmail.com > > > > > > > > > -- > Ing. Alvaro Gareppe > agare...@gmail.com >
Re: Reduce latency
Thank you Alvaro, How to use sync producers? I am running the standard ProducerPerformance test from kafka to measure the latency of each message to send from producer to broker only. The command is like "bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196" For running producers, where should I put the producer.type=sync configuration into? The config/server.properties? Also Does this mean we are using batch size of 1? Which version of Kafka are you using? thanks. On Thu, Aug 13, 2015 at 3:01 PM, Alvaro Gareppe wrote: > Are you measuring latency as time between producer and consumer ? > > In that case, the ack shouldn't affect the latency, cause even tough your > producer is not going to wait for the ack, the consumer will only get the > message after its commited in the server. > > About latency my best result occur with sync producers, but the throughput > is much lower in that case. > > About not flushing to disk I'm pretty sure that it's not an option in kafka > (correct me if I'm wrong) > > Regards, > Alvaro Gareppe > > On Thu, Aug 13, 2015 at 12:59 PM, Yuheng Du > wrote: > > > Also, the latency results show no major difference when using ack=0 or > > ack=1. Why is that? > > > > On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du > > wrote: > > > > > I am running an experiment where 92 producers is publishing data into 6 > > > brokers and 10 consumer are reading online data simultaneously. > > > > > > How should I do to reduce the latency? Currently when I run the > producer > > > performance test the average latency is around 10s. > > > > > > Should I disable log.flush? How to do that? Thanks. > > > > > > > > > -- > Ing. Alvaro Gareppe > agare...@gmail.com >
Reduce latency
I am running an experiment where 92 producers are publishing data into 6 brokers and 10 consumers are reading the data online simultaneously. What should I do to reduce the latency? Currently, when I run the producer performance test, the average latency is around 10s. Should I disable log.flush? How do I do that? Thanks.
Re: Reduce latency
Also, the latency results show no major difference when using ack=0 or ack=1. Why is that? On Thu, Aug 13, 2015 at 11:51 AM, Yuheng Du wrote: > I am running an experiment where 92 producers is publishing data into 6 > brokers and 10 consumer are reading online data simultaneously. > > How should I do to reduce the latency? Currently when I run the producer > performance test the average latency is around 10s. > > Should I disable log.flush? How to do that? Thanks. >
Variation of producer latency in ProducerPerformance test
Hi, I am running a test in which 92 producers each publish 53000 records of size 254 bytes to 2 brokers. The average latency reported by each producer varies widely. For some producers the average latency for sending the 53000 records is as low as 38ms, but for others it is as high as 13067ms. How can this be explained? To get the lowest latencies, what batch size and other important configs should I use? Thanks!
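As a back-of-the-envelope check on the figures in this message (all numbers are taken from the text; nothing here is measured):

```python
producers = 92
records_per_producer = 53_000
record_size = 254  # bytes

# Total data pushed at the 2 brokers over the whole test.
total_bytes = producers * records_per_producer * record_size
print(total_bytes)  # 1238504000, i.e. ~1.24 GB across only 2 brokers

# Spread between the best and worst reported average latencies.
best_ms, worst_ms = 38, 13067
print(worst_ms // best_ms)  # worst producer is ~343x the best on average
```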
Kafka vs RabbitMQ latency
Hi guys, I was reading a paper today in which the latency of Kafka and RabbitMQ is compared: http://downloads.hindawi.com/journals/js/2015/468047.pdf To my surprise, Kafka shows some large variations in latency as the number of records per second increases, so I am curious why that is. Also, in the ProducerPerformance test: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 *acks=1* bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196 Setting acks=1 means the producer will wait for an ack from the leader replica, right? Could that be the reason it affects latency? If I set it to 0, will it make the producers send as fast as possible, so that throughput increases and latency decreases in the test results? Thanks for answering. best,
Re: multiple producer throughput
The message size is 100 bytes and each producer sends out 50 million messages. These are the numbers used in the "benchmarking Kafka" post. http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines Thanks. On Mon, Jul 27, 2015 at 4:15 PM, Prabhjot Bharaj wrote: > Hi, > > Have you tried with acks=1 and -1 as well? > Please share the numbers and the message size > > Regards, > Prabcs > On Jul 27, 2015 10:24 PM, "Yuheng Du" wrote: > > > Hi, > > > > I am running 40 producers on 40 nodes cluster. The messages are sent to 6 > > brokers in another cluster. The producers are running ProducerPerformance > > test. > > > > When 20 nodes are running, the throughput is around 13MB/s and when > running > > 40 nodes, the throughput is around 9MB/s. > > > > I have set log.retention.ms=9000 to delete the unwanted messages, just > to > > avoid the disk space to be filled. > > > > So I want to know how should I tune the system to get better throughput > > result? Thanks. > > >
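Reading the reported rates as per-producer figures (an assumption; the message does not say whether they are per-producer or aggregate), the load on the 6-broker cluster works out as:

```python
msg_size = 100  # bytes, per the LinkedIn benchmark setup
brokers = 6

results = []
for producers, mb_per_s in [(20, 13), (40, 9)]:
    aggregate_mb = producers * mb_per_s      # MB/s into the cluster
    per_broker_mb = aggregate_mb / brokers   # MB/s per broker
    msgs_per_s = aggregate_mb * 1_000_000 // msg_size
    results.append((producers, aggregate_mb, round(per_broker_mb, 1), msgs_per_s))
    print(results[-1])

# 20 nodes: 260 MB/s aggregate; 40 nodes: 360 MB/s aggregate. Under this
# reading, total throughput still grew while per-producer throughput
# dropped, which points at the brokers (or the network) approaching
# saturation rather than the producers slowing down independently.
```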
Re: deleting data automatically
Thank you! On Mon, Jul 27, 2015 at 1:43 PM, Ewen Cheslack-Postava wrote: > As I mentioned, adjusting any settings such that files are small enough > that you don't get the benefits of append-only writes or file > creation/deletion become a bottleneck might affect performance. It looks > like the default setting for log.segment.bytes is 1GB, so given fast enough > cleanup of old logs, you may not need to adjust that setting -- assuming > you have a reasonable amount of storage, you'll easily fit many dozen log > files of that size. > > -Ewen > > On Mon, Jul 27, 2015 at 10:36 AM, Yuheng Du > wrote: > > > Thank you! what performance impacts will it be if I change > > log.segment.bytes? Thanks. > > > > On Mon, Jul 27, 2015 at 1:25 PM, Ewen Cheslack-Postava < > e...@confluent.io> > > wrote: > > > > > I think log.cleanup.interval.mins was removed in the first 0.8 release. > > It > > > sounds like you're looking at outdated docs. Search for > > > log.retention.check.interval.ms here: > > > http://kafka.apache.org/documentation.html > > > > > > As for setting the values too low hurting performance, I'd guess it's > > > probably only an issue if you set them extremely small, such that file > > > creation and cleanup become a bottleneck. > > > > > > -Ewen > > > > > > On Mon, Jul 27, 2015 at 10:03 AM, Yuheng Du > > > wrote: > > > > > > > If I want to get higher throughput, should I increase the > > > > log.segment.bytes? > > > > > > > > I don't see log.retention.check.interval.ms, but there is > > > > log.cleanup.interval.mins, is that what you mean? > > > > > > > > If I set log.roll.ms or log.cleanup.interval.mins too small, will it > > > hurt > > > > the throughput? Thanks. > > > > > > > > On Fri, Jul 24, 2015 at 11:03 PM, Ewen Cheslack-Postava < > > > e...@confluent.io > > > > > > > > > wrote: > > > > > > > > > You'll want to set the log retention policy via > > > > > log.retention.{ms,minutes,hours} or log.retention.bytes. 
If you > want > > > > really > > > > > aggressive collection (e.g., on the order of seconds, as you > > > specified), > > > > > you might also need to adjust log.segment.bytes/log.roll.{ms,hours} > > and > > > > > log.retention.check.interval.ms. > > > > > > > > > > On Fri, Jul 24, 2015 at 12:49 PM, Yuheng Du < > > yuheng.du.h...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > I am testing the kafka producer performance. So I created a queue > > and > > > > > > writes a large amount of data to that queue. > > > > > > > > > > > > Is there a way to delete the data automatically after some time, > > say > > > > > > whenever the data size reaches 50GB or the retention time exceeds > > 10 > > > > > > seconds, it will be deleted so my disk won't get filled and new > > data > > > > > can't > > > > > > be written in? > > > > > > > > > > > > Thanks.! > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Thanks, > > > > > Ewen > > > > > > > > > > > > > > > > > > > > > -- > > > Thanks, > > > Ewen > > > > > > > > > -- > Thanks, > Ewen >
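Ewen's caution about segment size can be made concrete. With the default 1 GiB log.segment.bytes, time-based retention on the order of seconds only bites once a segment rolls; at an assumed write rate (the 110 MB/s disk figure mentioned elsewhere on this list is reused here purely as an illustration), one segment takes almost as long to fill as the 10-second retention target itself:

```python
segment_bytes = 1 << 30          # default log.segment.bytes, 1 GiB
write_rate = 110 * 1_000_000     # bytes/s, an assumed per-partition rate

fill_seconds = segment_bytes / write_rate
print(round(fill_seconds, 2))  # ~9.76 s to fill one segment

# So for ~10 s retention, either log.segment.bytes must shrink or
# log.roll.ms must force more frequent rolls, as Ewen suggests (at the
# cost of more file creation/deletion overhead).
```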
Re: deleting data automatically
Thank you! what performance impacts will it be if I change log.segment.bytes? Thanks. On Mon, Jul 27, 2015 at 1:25 PM, Ewen Cheslack-Postava wrote: > I think log.cleanup.interval.mins was removed in the first 0.8 release. It > sounds like you're looking at outdated docs. Search for > log.retention.check.interval.ms here: > http://kafka.apache.org/documentation.html > > As for setting the values too low hurting performance, I'd guess it's > probably only an issue if you set them extremely small, such that file > creation and cleanup become a bottleneck. > > -Ewen > > On Mon, Jul 27, 2015 at 10:03 AM, Yuheng Du > wrote: > > > If I want to get higher throughput, should I increase the > > log.segment.bytes? > > > > I don't see log.retention.check.interval.ms, but there is > > log.cleanup.interval.mins, is that what you mean? > > > > If I set log.roll.ms or log.cleanup.interval.mins too small, will it > hurt > > the throughput? Thanks. > > > > On Fri, Jul 24, 2015 at 11:03 PM, Ewen Cheslack-Postava < > e...@confluent.io > > > > > wrote: > > > > > You'll want to set the log retention policy via > > > log.retention.{ms,minutes,hours} or log.retention.bytes. If you want > > really > > > aggressive collection (e.g., on the order of seconds, as you > specified), > > > you might also need to adjust log.segment.bytes/log.roll.{ms,hours} and > > > log.retention.check.interval.ms. > > > > > > On Fri, Jul 24, 2015 at 12:49 PM, Yuheng Du > > > wrote: > > > > > > > Hi, > > > > > > > > I am testing the kafka producer performance. So I created a queue and > > > > writes a large amount of data to that queue. > > > > > > > > Is there a way to delete the data automatically after some time, say > > > > whenever the data size reaches 50GB or the retention time exceeds 10 > > > > seconds, it will be deleted so my disk won't get filled and new data > > > can't > > > > be written in? > > > > > > > > Thanks.! 
> > > > > > > > > > > > > > > > -- > > > Thanks, > > > Ewen > > > > > > > > > -- > Thanks, > Ewen >
Re: deleting data automatically
If I want to get higher throughput, should I increase the log.segment.bytes? I don't see log.retention.check.interval.ms, but there is log.cleanup.interval.mins, is that what you mean? If I set log.roll.ms or log.cleanup.interval.mins too small, will it hurt the throughput? Thanks. On Fri, Jul 24, 2015 at 11:03 PM, Ewen Cheslack-Postava wrote: > You'll want to set the log retention policy via > log.retention.{ms,minutes,hours} or log.retention.bytes. If you want really > aggressive collection (e.g., on the order of seconds, as you specified), > you might also need to adjust log.segment.bytes/log.roll.{ms,hours} and > log.retention.check.interval.ms. > > On Fri, Jul 24, 2015 at 12:49 PM, Yuheng Du > wrote: > > > Hi, > > > > I am testing the kafka producer performance. So I created a queue and > > writes a large amount of data to that queue. > > > > Is there a way to delete the data automatically after some time, say > > whenever the data size reaches 50GB or the retention time exceeds 10 > > seconds, it will be deleted so my disk won't get filled and new data > can't > > be written in? > > > > Thanks.! > > > > > > -- > Thanks, > Ewen >
multiple producer throughput
Hi, I am running 40 producers on a 40-node cluster. The messages are sent to 6 brokers in another cluster. The producers are running the ProducerPerformance test. With 20 nodes running, the throughput is around 13MB/s; with 40 nodes, it is around 9MB/s. I have set log.retention.ms=9000 to delete unwanted messages, just to keep the disk from filling up. How should I tune the system to get better throughput? Thanks.
deleting data automatically
Hi, I am testing Kafka producer performance, so I created a queue and am writing a large amount of data to it. Is there a way to delete the data automatically after some time? Say, whenever the data size reaches 50GB or the retention time exceeds 10 seconds, it gets deleted so my disk won't fill up and block new writes. Thanks!
Re: producer test on multiple nodes
I deleted the queue and recreated it before running the test. Things are working after restarting the broker cluster, thanks! On Fri, Jul 24, 2015 at 12:06 PM, Gwen Shapira wrote: > Does topic "speedx1" exist? > > On Fri, Jul 24, 2015 at 7:09 AM, Yuheng Du > wrote: > > Hi, > > > > I am trying to run 20 performance test on 10 nodes using pbsdsh. > > > > The messages will send to a 6 brokers cluster. It seems to work for a > > while. When I delete the test queue and rerun the test, the broker does > not > > seem to process incoming messages: > > > > [yuhengd@node1739 kafka_2.10-0.8.2.1]$ bin/kafka-run-class.sh > > org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100 > -1 > > acks=1 bootstrap.servers=130.127.133.72:9092 buffer.memory=67108864 > > batch.size=8196 > > > > 1 records sent, 0.0 records/sec (0.00 MB/sec), 6.0 ms avg latency, > > 6.0 max latency. > > > > org.apache.kafka.common.errors.TimeoutException: Failed to update > metadata > > after 79 ms. > > > > > > Can anyone suggest what the problem is? Or what configuration should I > > change? Thanks! >
producer test on multiple nodes
Hi, I am trying to run 20 performance tests on 10 nodes using pbsdsh. The messages are sent to a 6-broker cluster. It seems to work for a while, but when I delete the test queue and rerun the test, the broker does not seem to process incoming messages:

[yuhengd@node1739 kafka_2.10-0.8.2.1]$ bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100 -1 acks=1 bootstrap.servers=130.127.133.72:9092 buffer.memory=67108864 batch.size=8196

1 records sent, 0.0 records/sec (0.00 MB/sec), 6.0 ms avg latency, 6.0 max latency.

org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 79 ms.

Can anyone suggest what the problem is? Or what configuration should I change? Thanks!
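One hedged mitigation for the metadata timeout above, assuming the topic was just deleted and recreated: raise metadata.fetch.timeout.ms (a producer config in this Kafka version) so the client waits longer for fresh topic metadata. The command is only assembled and echoed here, since actually running it needs a live broker; the 60000 ms value is illustrative:

```shell
# Sketch only: same ProducerPerformance invocation as above, with a longer
# metadata fetch timeout appended (value illustrative, not a recommendation):
CMD="bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance \
speedx1 5000 100 -1 acks=1 bootstrap.servers=130.127.133.72:9092 \
buffer.memory=67108864 batch.size=8196 metadata.fetch.timeout.ms=60000"
echo "$CMD"
```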
Re: broker data directory
Thank you, Nicolas! On Tue, Jul 21, 2015 at 10:46 AM, Nicolas Phung wrote: > Yes indeed. > > # A comma seperated list of directories under which to store log files > log.dirs=/var/lib/kafka > > You can put several disk/partitions too. > > Regards, > > On Tue, Jul 21, 2015 at 4:37 PM, Yuheng Du > wrote: > > > Just wanna make sure, in server.properties, the configuration > > log.dirs=/tmp/kafka-logs > > > > specifies the directory of where the log (data) stores, right? > > > > If I want the data to be saved elsewhere, this is the configuration I > need > > to change, right? > > > > Thanks for answering. > > > > best, > > >
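Nicolas's point about spreading data over several disks or partitions can be sketched as a config fragment. The directory paths below are hypothetical, and the file is a scratch location for the sketch; in a real broker this line lives in server.properties:

```shell
# Sketch of a multi-disk data-directory setting (paths hypothetical):
cat > /tmp/broker-datadirs.properties <<'EOF'
# A comma separated list of directories under which to store log files
log.dirs=/data/disk1/kafka-logs,/data/disk2/kafka-logs
EOF
grep '^log.dirs=' /tmp/broker-datadirs.properties
```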
broker data directory
Just want to make sure: in server.properties, the configuration log.dirs=/tmp/kafka-logs specifies the directory where the log (data) is stored, right? If I want the data saved elsewhere, this is the configuration I need to change, right? Thanks for answering. best,
Re: latency performance test
Hi Ewen, Thank you for your patient explaining. It is very helpful. Can we assume that the long latency of ProducerPerformance comes from queuing delay in the buffer and it is related to buffer size? Thank you! best, Yuheng On Thu, Jul 16, 2015 at 12:21 AM, Ewen Cheslack-Postava wrote: > The tests are meant to evaluate different things and the way they send > messages is the source of the difference. > > EndToEndLatency works with a single message at a time. It produces the > message then waits for the consumer to receive it. This approach guarantees > there is no delay due to queuing. The goal with this test is to evaluate > the *minimum* latency. > > ProducerPerformance focuses on achieving maximum throughput. This means it > will enqueue lots of records so it will always have more data to send (and > can use batching to increase the throughput). Unlike EndToEndLatency, this > means records may just sit in a queue on the producer for awhile because > the maximum number of in flight requests has been reached and it needs to > wait for responses for those requests. Since EndToEndLatency only ever has > one record outstanding, it will never encounter this case. > > Batching itself doesn't increase the latency because it only occurs when > the producer is either a) already unable to send messages anyway or b) > linger.ms is greater than 0, but the tests use the default setting that > doesn't linger at all. > > In your example for ProducerPerformance, you have 100 byte records and will > buffer up to 64MB. Given the batch size of 8K and default producer settings > of 5 in flight requests, you can roughly think of one round trip time > handling 5 * 8K = 40K bytes of data. If your roundtrip is 1ms, then if your > buffer is full at 64MB it will take you 64 MB / (40 KB/ms) = 1638ms = 1.6s. > That means that the record that was added at the end of the buffer had to > just sit in the buffer for 1.6s before it was sent off to the broker. 
And > if your buffer is consistently full (which it should be for > ProducerPerformance since it's sending as fast as it can), that means > *every* record waits that long. > > Of course, these numbers are estimates, depend on my having used 1ms, but > hopefully should make it clear why you can see relatively large latencies. > > -Ewen > > > On Wed, Jul 15, 2015 at 1:38 AM, Yuheng Du > wrote: > > > Hi, > > > > I have run the end to end latency test and the producerPerformance test > on > > my kafka cluster according to > > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c > > > > In end to end latency test, the latency was around 2ms. In > > producerperformance test, if use batch size 8196 to send 50,000,000 > > records: > > > > >bin/kafka-run-class.sh > org.apache.kafka.clients.tools.ProducerPerformance > > speedx1 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 > > buffer.memory=67108864 batch.size=8196 > > > > > > The results show that max latency is 3617ms, avg latency 626.7ms. I wanna > > know why the latency in producerperformance test is significantly larger > > than end to end test? Is it because of batching? Are the definitons of > > these two latencies different? I looked at the source code and I believe > > the latency is measure for the producer.send() function to complete. So > > does this latency includes transmission delay, transferring delay, and > what > > other components? > > > > > > Thanks. > > > > > > best, > > > > Yuheng > > > > > > -- > Thanks, > Ewen >
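Ewen's estimate above can be reproduced with a quick back-of-envelope calculation. All inputs are the assumptions he states in the thread (1 ms round trip, 5 in-flight requests, 8 KB batches, 64 MB buffer):

```shell
# Back-of-envelope queueing delay for a record added at the tail of a full buffer:
awk 'BEGIN {
  kb_per_ms = 5 * 8 / 1            # 5 in-flight requests * 8 KB batches / 1 ms RTT
  wait_ms = 64 * 1024 / kb_per_ms  # 64 MB buffer drained at ~40 KB per ms
  printf "%.0f ms (~%.1f s)\n", wait_ms, wait_ms / 1000
}'
# prints: 1638 ms (~1.6 s)
```

This matches the ~1.6 s figure in the reply, and it shows why the reported average latency scales with buffer.memory when the producer keeps the buffer full.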
Re: Latency test
Thank you, Tao! On Jul 15, 2015 6:27 PM, "Tao Feng" wrote: > Sorry Yufeng, You should change it in $KAFKA_HEAP_OPTS. > > On Wed, Jul 15, 2015 at 3:09 PM, Tao Feng wrote: > > > Hi Yuheng, > > > > You could add the -Xmx1024m in > > https://github.com/apache/kafka/blob/trunk/bin/kafka-run-class.sh > > KAFKA_JVM_PERFORMANCE_OPTS. > > > > > > > > On Wed, Jul 15, 2015 at 12:51 AM, Yuheng Du > > wrote: > > > >> Tao, > >> > >> If I am running on the command line the following command > >> >bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency > 192.168.1.3:9092 > >> 192.168.1.1:2181 speedx3 5000 100 1 -Xmx1024m > >> > >> It promped that it is not correct. So where should I put the -Xmx1024m > >> option? Thanks. > >> > >> > >> > >> On Wed, Jul 15, 2015 at 3:44 AM, Tao Feng wrote: > >> > >> > (Please correct me if I am wrong.) Based on TestEndToEndLatency( > >> > > >> > > >> > https://github.com/apache/kafka/blob/trunk/core/src/test/scala/kafka/tools/TestEndToEndLatency.scala > >> > ), > >> > consumer_fetch_max_wait corresponds to fetch.wait.max.ms in consumer > >> > config( > >> > http://kafka.apache.org/documentation.html#consumerconfigs). The > >> default > >> > value listed at document is 100(ms). > >> > > >> > To add java heap space to jvm, put -Xmx$Size(max heap size) for your > jvm > >> > option. > >> > > >> > On Wed, Jul 15, 2015 at 12:29 AM, Yuheng Du > > >> > wrote: > >> > > >> > > Tao, > >> > > > >> > > Thanks. The example on > >> > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c > >> > > is outdated already. The error message shows: > >> > > > >> > > USAGE: java kafka.tools.TestEndToEndLatency$ broker_list > >> > zookeeper_connect > >> > > topic num_messages consumer_fetch_max_wait producer_acks > >> > > > >> > > > >> > > Can anyone helps me what should be put in consumer_fetch_max_wait? > >> > Thanks. 
> >> > > > >> > > On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng > >> wrote: > >> > > > >> > > > I think ProducerPerformance microbenchmark only measure between > >> client > >> > > to > >> > > > brokers(producer to brokers) and provide latency information. > >> > > > > >> > > > On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du < > >> yuheng.du.h...@gmail.com> > >> > > > wrote: > >> > > > > >> > > > > Currently, the latency test from kafka test the end to end > latency > >> > > > between > >> > > > > producers and consumers. > >> > > > > > >> > > > > Is there a way to test the producer to broker and broker to > >> > consumer > >> > > > > delay seperately? > >> > > > > > >> > > > > Thanks. > >> > > > > > >> > > > > >> > > > >> > > >> > > > > >
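Tao's correction above can be sketched as follows: -Xmx is not a command-line argument of TestEndToEndLatency, so it goes in the KAFKA_HEAP_OPTS environment variable, which bin/kafka-run-class.sh reads. The 1024m value is illustrative, and the tool invocation is commented out since it needs a live broker:

```shell
# Set the heap for Kafka's run scripts, then invoke the tool as usual:
export KAFKA_HEAP_OPTS="-Xmx1024m"
# bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092 192.168.1.1:2181 speedx3 5000 100 1
echo "$KAFKA_HEAP_OPTS"
```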
Re: kafka benchmark tests
Hi Geoffrey, Thank you for your detailed explaining. They are really helpful. I am thinking of going after the second way, since I have bare metal access to all the nodes in the cluster, it's probably better to run real slave machines instead of virtual machines. (correct me if I am wrong) Each of my node has 256 G ram and 2T disk space, how large will the slave machine virtual machine be and how much memory they will take? Thank you! best, Yuheng On Wed, Jul 15, 2015 at 4:19 PM, Geoffrey Anderson wrote: > Hi Yuheng, > > Yes, you should be able to run on either mac or linux. > > The test cluster consists of a test-driver machine and some number of slave > machines. Right now, there are roughly two ways to set up the slave > machines: > > 1) Slave machines are virtual machines *on* the test-driver machine. > 2) Slave machines are external to the test-driver machine. > > 1 is the simplest to set up, but yes it does require installation of the > virtual machines on the test-driver machine. > > The installation of these machines is outlined in the quickstart I > mentioned (here is a better link for the test README: > https://github.com/confluentinc/kafka/tree/KAFKA-2276/tests). > > The tool we're using to bring up the slave virtual machines is called > vagrant, so the "vagrant" steps in the quickstart are really telling you > how to install the virtual machines. > > Hope that helps! > > Cheers, > Geoff > > > > > On Wed, Jul 15, 2015 at 12:13 PM, Yuheng Du > wrote: > > > Hi Geoffrey, > > > > Thank you for your helpful information. Do I have to install the virtual > > machines? I am using Mac as the testdriver machine or I can use a linux > > machine to run testdriver too. > > > > Thanks. 
> > > > best, > > Yuheng > > > > On Wed, Jul 15, 2015 at 2:55 PM, Geoffrey Anderson > > wrote: > > > > > Hi Yuheng, > > > > > > Running these tests requires a tool we've created at Confluent called > > > 'ducktape', which you need to install with the command: > > > pip install ducktape==0.2.0 > > > > > > Running the tests locally requires some setup (creation of virtual > > machines > > > etc.) which is outlined here: > > > > > > > > > https://github.com/apache/kafka/pull/70/files#diff-62f0ff60ede3b78b9c95624e2f61d6c1 > > > The instructions in the quickstart show you how to run the tests on > > cluster > > > of virtual machines (on a single host) > > > > > > Once you have a cluster up and running, you'll be able to run the test > > > you're interested in: > > > cd kafka/tests > > > ducktape kafkatest/tests/benchmark_test.py > > > > > > Definitely keep us posted about which parts are difficult, annoying, or > > > confusing about this process and we'll do our best to help. > > > > > > Thanks, > > > Geoff > > > > > > > > > > > > On Wed, Jul 15, 2015 at 12:49 AM, Yuheng Du > > > wrote: > > > > > > > Jiefu, > > > > > > > > Have you tried to run benchmark_test.py? I ran it and it asks me for > > the > > > > ducktape.services.service > > > > > > > > yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python > > > benchmark_test.py > > > > > > > > Traceback (most recent call last): > > > > > > > > File "benchmark_test.py", line 16, in > > > > > > > > from ducktape.services.service import Service > > > > > > > > ImportError: No module named ducktape.services.service > > > > > > > > > > > > Can you help me on getting it to work, Ewen? Thanks. > > > > > > > > > > > > best, > > > > > > > > Yuheng > > > > > > > > On Tue, Jul 14, 2015 at 11:28 PM, Ewen Cheslack-Postava < > > > e...@confluent.io > > > > > > > > > wrote: > > > > > > > > > @Jiefu, yes! The patch is functional, I think it's just waiting on > a > > > bit > > > > of > > > > > final review after the last round of changes. 
You can definitely > use > > it > > > > for > > > > > your own benchmarking, and we'd love to see patches for any > > additional > > > > > tests we missed in the first pass! > > > > > > > > > > -Ewen > > > > > > >
Re: kafka TestEndtoEndLatency
Thanks. Here is the source code snippet of the EndtoEndLatency test:

for (i <- 0 until numMessages) {
  val begin = System.nanoTime
  producer.send(new ProducerRecord[Array[Byte], Array[Byte]](topic, message))
  val received = iter.next
  val elapsed = System.nanoTime - begin
  // poor man's progress bar
  if (i % 1000 == 0)
    println(i + "\t" + elapsed / 1000.0 / 1000.0)
  totalTime += elapsed
  latencies(i) = (elapsed / 1000 / 1000)
}

iter.next reads a message using a consumer process. Is the producer.send() call blocking? I think the message size should still matter here, am I right? In the source code it is "val message = "hello there beautiful".getBytes", which is around 21 bytes. I see that no acks are involved in this EndtoEndLatency test, right? Thanks. Yuheng On Wed, Jul 15, 2015 at 3:43 PM, Guozhang Wang wrote: > The end-to-end latency records the transferring of a message from producer > to broker, then to consumer. > > I cannot remember the details now but I think the EndtoEndLatency test > records the latency as an average, hence it is small. > > Guozhang > > On Wed, Jul 15, 2015 at 12:28 PM, Yuheng Du > wrote: > > > Guozhang, > > > > Thank you for explaining. I see that in ProducerPerformance call back > > functions were used to get the latency metrics. > > For the TestEndtoEndLatency, does message size matter? What this > end-to-end > > latency comprise of, besides transferring a package from source to > > destination (typically around 0.2 ms for 100bytes message)? > > > > Thanks. > > > > Yuheng > > > > On Wed, Jul 15, 2015 at 3:01 PM, Guozhang Wang > wrote: > > > > > Yuheng, > > > > > > Only TestEndtoEndLatency's number are end to end, for > ProducerPerformance > > > the latency is for the send-to-ack latency, which increases as batch > size > > > increases.
> > > > > > Guozhang > > > > > > On Wed, Jul 15, 2015 at 11:36 AM, Yuheng Du > > > wrote: > > > > > > > In kafka performance tests https://gist.github.com/jkreps > > > > /c7ddb4041ef62a900e6c > > > > > > > > The TestEndtoEndLatency results are typically around 2ms, while the > > > > ProducerPerformance normally has "average latency"around several > > hundres > > > ms > > > > when using batch size 8196. > > > > > > > > Are both results talking about end to end latency? and the > > > > ProducerPerformance results' latency is just larger because of > batching > > > of > > > > several messages? > > > > > > > > What is the message size of the TestEndtoEndLatency test uses? > > > > > > > > Thanks for any ideas. > > > > > > > > best, > > > > > > > > > > > > > > > > -- > > > -- Guozhang > > > > > > > > > -- > -- Guozhang >
Re: kafka TestEndtoEndLatency
Guozhang, Thank you for explaining. I see that in ProducerPerformance call back functions were used to get the latency metrics. For the TestEndtoEndLatency, does message size matter? What this end-to-end latency comprise of, besides transferring a package from source to destination (typically around 0.2 ms for 100bytes message)? Thanks. Yuheng On Wed, Jul 15, 2015 at 3:01 PM, Guozhang Wang wrote: > Yuheng, > > Only TestEndtoEndLatency's number are end to end, for ProducerPerformance > the latency is for the send-to-ack latency, which increases as batch size > increases. > > Guozhang > > On Wed, Jul 15, 2015 at 11:36 AM, Yuheng Du > wrote: > > > In kafka performance tests https://gist.github.com/jkreps > > /c7ddb4041ef62a900e6c > > > > The TestEndtoEndLatency results are typically around 2ms, while the > > ProducerPerformance normally has "average latency"around several hundres > ms > > when using batch size 8196. > > > > Are both results talking about end to end latency? and the > > ProducerPerformance results' latency is just larger because of batching > of > > several messages? > > > > What is the message size of the TestEndtoEndLatency test uses? > > > > Thanks for any ideas. > > > > best, > > > > > > -- > -- Guozhang >
Re: kafka benchmark tests
Hi Geoffrey, Thank you for your helpful information. Do I have to install the virtual machines? I am using Mac as the testdriver machine or I can use a linux machine to run testdriver too. Thanks. best, Yuheng On Wed, Jul 15, 2015 at 2:55 PM, Geoffrey Anderson wrote: > Hi Yuheng, > > Running these tests requires a tool we've created at Confluent called > 'ducktape', which you need to install with the command: > pip install ducktape==0.2.0 > > Running the tests locally requires some setup (creation of virtual machines > etc.) which is outlined here: > > https://github.com/apache/kafka/pull/70/files#diff-62f0ff60ede3b78b9c95624e2f61d6c1 > The instructions in the quickstart show you how to run the tests on cluster > of virtual machines (on a single host) > > Once you have a cluster up and running, you'll be able to run the test > you're interested in: > cd kafka/tests > ducktape kafkatest/tests/benchmark_test.py > > Definitely keep us posted about which parts are difficult, annoying, or > confusing about this process and we'll do our best to help. > > Thanks, > Geoff > > > > On Wed, Jul 15, 2015 at 12:49 AM, Yuheng Du > wrote: > > > Jiefu, > > > > Have you tried to run benchmark_test.py? I ran it and it asks me for the > > ducktape.services.service > > > > yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python > benchmark_test.py > > > > Traceback (most recent call last): > > > > File "benchmark_test.py", line 16, in > > > > from ducktape.services.service import Service > > > > ImportError: No module named ducktape.services.service > > > > > > Can you help me on getting it to work, Ewen? Thanks. > > > > > > best, > > > > Yuheng > > > > On Tue, Jul 14, 2015 at 11:28 PM, Ewen Cheslack-Postava < > e...@confluent.io > > > > > wrote: > > > > > @Jiefu, yes! The patch is functional, I think it's just waiting on a > bit > > of > > > final review after the last round of changes. 
You can definitely use it > > for > > > your own benchmarking, and we'd love to see patches for any additional > > > tests we missed in the first pass! > > > > > > -Ewen > > > > > > On Tue, Jul 14, 2015 at 10:53 AM, JIEFU GONG > wrote: > > > > > > > Yuheng, > > > > I would recommend looking here: > > > > http://kafka.apache.org/documentation.html#brokerconfigs and > scrolling > > > > down > > > > to get a better understanding of the default settings and what they > > mean > > > -- > > > > it'll tell you what different options for acks does. > > > > > > > > Ewen, > > > > Thank you immensely for your thoughts, they shed a lot of insight > into > > > the > > > > issue. Though it is understandable that your specific results need to > > be > > > > verified, it seems that the KIP-25 patch is functional and I can use > it > > > for > > > > my own benchmarking purposes? Is that correct? Thanks again! > > > > > > > > On Tue, Jul 14, 2015 at 8:22 AM, Yuheng Du > > > > > wrote: > > > > > > > > > Also, I guess setting the target throughput to -1 means let it be > as > > > high > > > > > as possible? > > > > > > > > > > On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du < > > yuheng.du.h...@gmail.com> > > > > > wrote: > > > > > > > > > > > Thanks. If I set the acks=1 in the producer config options in > > > > > > bin/kafka-run-class.sh > > > > org.apache.kafka.clients.tools.ProducerPerformance > > > > > > test7 50000000 100 -1 acks=1 bootstrap.servers= > > > > > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 > > > > > batch.size=8196? > > > > > > > > > > > > Does that mean for each message generated at the producer, the > > > producer > > > > > > will wait until the broker sends the ack back, then send another > > > > message? > > > > > > > > > > > > Thanks. > > > > > > > > > > > > Yuheng > > > > > > > > > > > > On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy < > > > > ku...@nmsworks.co.in> > > > > > > wrote: > > > > > > > > > > >
kafka TestEndtoEndLatency
In the kafka performance tests at https://gist.github.com/jkreps/c7ddb4041ef62a900e6c, the TestEndtoEndLatency results are typically around 2ms, while ProducerPerformance normally reports an "average latency" of several hundred ms when using batch size 8196. Are both results measuring end-to-end latency, and is the ProducerPerformance latency just larger because several messages are batched? What message size does the TestEndtoEndLatency test use? Thanks for any ideas. best,
Re: Latency test
I have run the end to end latency test and the producerPerformance test on my kafka cluster according to https://gist.github.com/jkreps/c7ddb4041ef62a900e6c In end to end latency test, the latency was around 2ms. In producerperformance test, if use batch size 8196 to send 50,000,000 records: >bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864 batch.size=8196 The results show that max latency is 3617ms, avg latency 626.7ms. I wanna know why the latency in producerperformance test is significantly larger than end to end test? Is it because of batching? Are the definitons of these two latencies different? I looked at the source code and I believe the latency is measure for the producer.send() function to complete. So does this latency includes transmission delay, transferring delay, and what other components? Thanks. best, Yuheng On Wed, Jul 15, 2015 at 3:51 AM, Yuheng Du wrote: > Tao, > > If I am running on the command line the following command > >bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092 > 192.168.1.1:2181 speedx3 5000 100 1 -Xmx1024m > > It promped that it is not correct. So where should I put the -Xmx1024m > option? Thanks. > > > > On Wed, Jul 15, 2015 at 3:44 AM, Tao Feng wrote: > >> (Please correct me if I am wrong.) Based on TestEndToEndLatency( >> >> https://github.com/apache/kafka/blob/trunk/core/src/test/scala/kafka/tools/TestEndToEndLatency.scala >> ), >> consumer_fetch_max_wait corresponds to fetch.wait.max.ms in consumer >> config( >> http://kafka.apache.org/documentation.html#consumerconfigs). The default >> value listed at document is 100(ms). >> >> To add java heap space to jvm, put -Xmx$Size(max heap size) for your jvm >> option. >> >> On Wed, Jul 15, 2015 at 12:29 AM, Yuheng Du >> wrote: >> >> > Tao, >> > >> > Thanks. The example on >> https://gist.github.com/jkreps/c7ddb4041ef62a900e6c >> > is outdated already. 
The error message shows: >> > >> > USAGE: java kafka.tools.TestEndToEndLatency$ broker_list >> zookeeper_connect >> > topic num_messages consumer_fetch_max_wait producer_acks >> > >> > >> > Can anyone helps me what should be put in consumer_fetch_max_wait? >> Thanks. >> > >> > On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng wrote: >> > >> > > I think ProducerPerformance microbenchmark only measure between >> client >> > to >> > > brokers(producer to brokers) and provide latency information. >> > > >> > > On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du > > >> > > wrote: >> > > >> > > > Currently, the latency test from kafka test the end to end latency >> > > between >> > > > producers and consumers. >> > > > >> > > > Is there a way to test the producer to broker and broker to >> consumer >> > > > delay seperately? >> > > > >> > > > Thanks. >> > > > >> > > >> > >> > >
latency performance test
Hi, I have run the end-to-end latency test and the ProducerPerformance test on my kafka cluster following https://gist.github.com/jkreps/c7ddb4041ef62a900e6c In the end-to-end latency test, the latency was around 2ms. In the ProducerPerformance test, using batch size 8196 to send 50,000,000 records:

>bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance speedx1 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864 batch.size=8196

the results show a max latency of 3617ms and an average latency of 626.7ms. I want to know why the latency in the ProducerPerformance test is significantly larger than in the end-to-end test. Is it because of batching? Are the definitions of these two latencies different? I looked at the source code and believe the latency is measured as the time for the producer.send() call to complete. Does this latency include transmission delay, transfer delay, and what other components? Thanks. best, Yuheng
Re: kafka benchmark tests
Jiefu, Have you tried to run benchmark_test.py? I ran it and it asks me for the ducktape.services.service yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ python benchmark_test.py Traceback (most recent call last): File "benchmark_test.py", line 16, in from ducktape.services.service import Service ImportError: No module named ducktape.services.service Can you help me on getting it to work, Ewen? Thanks. best, Yuheng On Tue, Jul 14, 2015 at 11:28 PM, Ewen Cheslack-Postava wrote: > @Jiefu, yes! The patch is functional, I think it's just waiting on a bit of > final review after the last round of changes. You can definitely use it for > your own benchmarking, and we'd love to see patches for any additional > tests we missed in the first pass! > > -Ewen > > On Tue, Jul 14, 2015 at 10:53 AM, JIEFU GONG wrote: > > > Yuheng, > > I would recommend looking here: > > http://kafka.apache.org/documentation.html#brokerconfigs and scrolling > > down > > to get a better understanding of the default settings and what they mean > -- > > it'll tell you what different options for acks does. > > > > Ewen, > > Thank you immensely for your thoughts, they shed a lot of insight into > the > > issue. Though it is understandable that your specific results need to be > > verified, it seems that the KIP-25 patch is functional and I can use it > for > > my own benchmarking purposes? Is that correct? Thanks again! > > > > On Tue, Jul 14, 2015 at 8:22 AM, Yuheng Du > > wrote: > > > > > Also, I guess setting the target throughput to -1 means let it be as > high > > > as possible? > > > > > > On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du > > > wrote: > > > > > > > Thanks. If I set the acks=1 in the producer config options in > > > > bin/kafka-run-class.sh > > org.apache.kafka.clients.tools.ProducerPerformance > > > > test7 5000 100 -1 acks=1 bootstrap.servers= > > > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 > > > batch.size=8196? 
> > > > > > > > Does that mean for each message generated at the producer, the > producer > > > > will wait until the broker sends the ack back, then send another > > message? > > > > > > > > Thanks. > > > > > > > > Yuheng > > > > > > > > On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy < > > ku...@nmsworks.co.in> > > > > wrote: > > > > > > > >> Yes, A list of Kafka Server host/port pairs to use for establishing > > the > > > >> initial connection to the Kafka cluster > > > >> > > > >> https://kafka.apache.org/documentation.html#newproducerconfigs > > > >> > > > >> On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du < > yuheng.du.h...@gmail.com> > > > >> wrote: > > > >> > > > >> > Does anyone know what is bootstrap.servers= > > > >> > esv4-hcl198.grid.linkedin.com:9092 means in the following test > > > command: > > > >> > > > > >> > bin/kafka-run-class.sh > > > >> org.apache.kafka.clients.tools.ProducerPerformance > > > >> > test7 5000 100 -1 acks=1 bootstrap.servers= > > > >> > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 > > > >> batch.size=8196? > > > >> > > > > >> > what is bootstrap.servers? Is it the kafka server that I am > running > > a > > > >> test > > > >> > at? > > > >> > > > > >> > Thanks. > > > >> > > > > >> > Yuheng > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava < > > > >> e...@confluent.io > > > >> > > > > > >> > wrote: > > > >> > > > > >> > > I implemented (nearly) the same basic set of tests in the system > > > test > > > >> > > framework we started at Confluent and that is going to move into > > > >> Kafka -- > > > >> > > see the wip patch for KIP-25 here: > > > >> > https://github.com/apache/kafka/pull/70 > > > >> > > In particular, that test is implemented in benchmark_test.py: > > > >> > > > > > >> > &g
Re: Latency test
Tao, If I run the following command on the command line: >bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092 192.168.1.1:2181 speedx3 5000 100 1 -Xmx1024m It prompted that the command is not correct. So where should I put the -Xmx1024m option? Thanks. On Wed, Jul 15, 2015 at 3:44 AM, Tao Feng wrote: > (Please correct me if I am wrong.) Based on TestEndToEndLatency( > > https://github.com/apache/kafka/blob/trunk/core/src/test/scala/kafka/tools/TestEndToEndLatency.scala > ), > consumer_fetch_max_wait corresponds to fetch.wait.max.ms in consumer > config( > http://kafka.apache.org/documentation.html#consumerconfigs). The default > value listed at document is 100(ms). > > To add java heap space to jvm, put -Xmx$Size(max heap size) for your jvm > option. > > On Wed, Jul 15, 2015 at 12:29 AM, Yuheng Du > wrote: > > > Tao, > > > > Thanks. The example on > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c > > is outdated already. The error message shows: > > > > USAGE: java kafka.tools.TestEndToEndLatency$ broker_list > zookeeper_connect > > topic num_messages consumer_fetch_max_wait producer_acks > > > > > > Can anyone helps me what should be put in consumer_fetch_max_wait? > Thanks. > > > > On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng wrote: > > > I think ProducerPerformance microbenchmark only measure between client > > to > > > brokers(producer to brokers) and provide latency information. > > > > > > On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du > > > wrote: > > > > > > > Currently, the latency test from kafka test the end to end latency > > > between > > > > producers and consumers. > > > > > > > > Is there a way to test the producer to broker and broker to > consumer > > > > delay seperately? > > > > > > > > Thanks. > > > > > > > >
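TestEndToEndLatency takes only its six positional arguments, so a JVM flag like -Xmx1024m cannot be appended to that command line; kafka-run-class.sh instead reads its JVM heap settings from the KAFKA_HEAP_OPTS environment variable. A sketch of the invocation (the actual Kafka command is commented out because it needs a running broker and ZooKeeper):

```shell
# Export the heap option first; kafka-run-class.sh passes KAFKA_HEAP_OPTS
# to the JVM it launches in place of its small default heap.
export KAFKA_HEAP_OPTS="-Xmx1024M"

# bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency \
#     192.168.1.3:9092 192.168.1.1:2181 speedx3 5000 100 1

echo "$KAFKA_HEAP_OPTS"
```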
Re: Latency test
I got a Java out-of-heap error when running the end-to-end latency test: yuhengdu@consumer0:/packages/kafka_2.10-0.8.2.1$ bin/kafka-run-class.sh kafka.tools.TestEndToEndLatency 192.168.1.3:9092 192.168.1.1:2181 speedx3 5000 100 1 Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at kafka.tools.TestEndToEndLatency$.main(TestEndToEndLatency.scala:69) at kafka.tools.TestEndToEndLatency.main(TestEndToEndLatency.scala) What command should I use to give the JVM more heap space? Thanks! Yuheng On Wed, Jul 15, 2015 at 3:29 AM, Yuheng Du wrote: > Tao, > > Thanks. The example on https://gist.github.com/jkreps/c7ddb4041ef62a900e6c > is outdated already. The error message shows: > > USAGE: java kafka.tools.TestEndToEndLatency$ broker_list zookeeper_connect > topic num_messages consumer_fetch_max_wait producer_acks > > > Can anyone helps me what should be put in consumer_fetch_max_wait? Thanks. > > On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng wrote: > >> I think ProducerPerformance microbenchmark only measure between client to >> brokers(producer to brokers) and provide latency information. >> >> On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du >> wrote: >> >> > Currently, the latency test from kafka test the end to end latency >> between >> > producers and consumers. >> > >> > Is there a way to test the producer to broker and broker to consumer >> > delay seperately? >> > >> > Thanks. >> > >> >
Re: Latency test
Tao, Thanks. The example on https://gist.github.com/jkreps/c7ddb4041ef62a900e6c is outdated already. The error message shows: USAGE: java kafka.tools.TestEndToEndLatency$ broker_list zookeeper_connect topic num_messages consumer_fetch_max_wait producer_acks Can anyone tell me what should be put in consumer_fetch_max_wait? Thanks. On Tue, Jul 14, 2015 at 5:21 PM, Tao Feng wrote: > I think ProducerPerformance microbenchmark only measure between client to > brokers(producer to brokers) and provide latency information. > > On Tue, Jul 14, 2015 at 11:05 AM, Yuheng Du > wrote: > > > Currently, the latency test from kafka test the end to end latency > between > > producers and consumers. > > > > Is there a way to test the producer to broker and broker to consumer > > delay seperately? > > > > Thanks. > > >
Re: How to run the three producers test
Jiefu, Now even though there is enough disk space (less than 18% used), when I run the test it still fails, and the broker logs say: [2015-07-14 23:08:48,735] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable) org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000 at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880) at org.I0Itec.zkclient.ZkClient.(ZkClient.java:98) at org.I0Itec.zkclient.ZkClient.(ZkClient.java:84) at kafka.server.KafkaServer.initZk(KafkaServer.scala:157) at kafka.server.KafkaServer.startup(KafkaServer.scala:82) at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:29) at kafka.Kafka$.main(Kafka.scala:46) at kafka.Kafka.main(Kafka.scala) [2015-07-14 23:08:48,737] INFO [Kafka Server 1], shutting down (kafka.server.KafkaServer) I have checked that the zookeeper is running fine. Can anyone help me figure out why I get this error? Thanks. On Tue, Jul 14, 2015 at 10:24 PM, Yuheng Du wrote: > But is there a way to let kafka override the old data if the disk is > filled? Or is it not necessary to use this figure? Thanks. > > On Tue, Jul 14, 2015 at 10:14 PM, Yuheng Du > wrote: > >> Jiefu, >> >> I agree with you. I checked the hardware specs of my machines, each one >> of them has: >> >> RAM >> >> >> >> 256GB ECC Memory (16x 16 GB DDR4 1600MT/s dual rank RDIMMs >> >> Disk >> >> >> >> Two 1 TB 7.2K RPM 3G SATA HDDs >> >> For the throughput versus stored data test, it uses 5*10^10 messages, >> which has the total volume of 5TB, I made the replication factor to be 3, >> which means the total size including replicas would be 15TB, which >> apparently overwhelmed the two brokers I use. >> >> Thanks. >> >> best, >> Yuheng >> >> On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG wrote: >> >>> Someone may correct me if I am incorrect, but how much disk space do you >>> have on these nodes?
Your exception 'No space left on device' from one of >>> your brokers seems to suggest that you're full (after all you're writing >>> 500 million records). If this is the case I believe the expected behavior >>> for Kafka is to reject any more attempts to write data? >>> >>> On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du >>> wrote: >>> >>> > Also, the log in another broker (not the bootstrap) says: >>> > >>> > [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error >>> > writing to highwatermark file: (kafka.server.ReplicaManager) >>> > >>> > >>> > [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 >>> because >>> > of error (kafka.network.Process >>> > >>> > or) >>> > >>> > java.io.IOException: Connection reset by peer >>> > >>> > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) >>> > >>> > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) >>> > >>> > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) >>> > >>> > at sun.nio.ch.IOUtil.read(IOUtil.java:197) >>> > >>> > at >>> sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) >>> > >>> > at kafka.utils.Utils$.read(Utils.scala:380) >>> > >>> > at >>> > >>> > >>> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) >>> > >>> > at kafka.network.Processor.read(SocketServer.scala:444) >>> > >>> > at kafka.network.Processor.run(SocketServer.scala:340) >>> > >>> > at java.lang.Thread.run(Thread.java:745) >>> > >>> > >>> > >>> > java.io.IOException: No space left on device >>> > >>> > at java.io.FileOutputStream.writeBytes(Native Method) >>> > >>> > at java.io.FileOutputStream.write(FileOutputStream.java:345) >>> > >>> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) >>> > >>> > at sun.nio.cs.StreamEncoder.implFlushBuffe >>> > >>> > (END) >>> > >>> > On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du >>> > wrote: >>> > >>
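If ZooKeeper itself is confirmed healthy, one knob worth checking is the broker's ZooKeeper connection timeout, whose default is exactly the 6000 ms that appears in the ZkTimeoutException above. A sketch of the relevant config/server.properties entries (the address and the larger timeout value are illustrative, not recommendations):

```
# config/server.properties
zookeeper.connect=192.168.1.1:2181
# raise this if the broker cannot reach ZooKeeper within the default 6000 ms
zookeeper.connection.timeout.ms=30000
```

Note that a longer timeout only helps with slow connectivity; a firewall rule or a wrong zookeeper.connect address would still fail after the larger window expires.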
Re: How to run the three producers test
But is there a way to let Kafka overwrite the old data once the disk fills up? Or is it not necessary to use this figure? Thanks. On Tue, Jul 14, 2015 at 10:14 PM, Yuheng Du wrote: > Jiefu, > > I agree with you. I checked the hardware specs of my machines, each one of > them has: > > RAM > > > > 256GB ECC Memory (16x 16 GB DDR4 1600MT/s dual rank RDIMMs > > Disk > > > > Two 1 TB 7.2K RPM 3G SATA HDDs > > For the throughput versus stored data test, it uses 5*10^10 messages, > which has the total volume of 5TB, I made the replication factor to be 3, > which means the total size including replicas would be 15TB, which > apparently overwhelmed the two brokers I use. > > Thanks. > > best, > Yuheng > > On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG wrote: > >> Someone may correct me if I am incorrect, but how much disk space do you >> have on these nodes? Your exception 'No space left on device' from one of >> your brokers seems to suggest that you're full (after all you're writing >> 500 million records). If this is the case I believe the expected behavior >> for Kafka is to reject any more attempts to write data?
>> >> On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du >> wrote: >> >> > Also, the log in another broker (not the bootstrap) says: >> > >> > [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error >> > writing to highwatermark file: (kafka.server.ReplicaManager) >> > >> > >> > [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 >> because >> > of error (kafka.network.Process >> > >> > or) >> > >> > java.io.IOException: Connection reset by peer >> > >> > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) >> > >> > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) >> > >> > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) >> > >> > at sun.nio.ch.IOUtil.read(IOUtil.java:197) >> > >> > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) >> > >> > at kafka.utils.Utils$.read(Utils.scala:380) >> > >> > at >> > >> > >> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) >> > >> > at kafka.network.Processor.read(SocketServer.scala:444) >> > >> > at kafka.network.Processor.run(SocketServer.scala:340) >> > >> > at java.lang.Thread.run(Thread.java:745) >> > >> > >> > >> > java.io.IOException: No space left on device >> > >> > at java.io.FileOutputStream.writeBytes(Native Method) >> > >> > at java.io.FileOutputStream.write(FileOutputStream.java:345) >> > >> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) >> > >> > at sun.nio.cs.StreamEncoder.implFlushBuffe >> > >> > (END) >> > >> > On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du >> > wrote: >> > >> > > Hi Jiefu, Gwen, >> > > >> > > I am running the Throughput versus stored data test: >> > > bin/kafka-run-class.sh >> org.apache.kafka.clients.tools.ProducerPerformance >> > > test 500 100 -1 acks=1 bootstrap.servers= >> > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 >> > batch.size=8196 >> > > >> > > After around 50,000,000 messages were sent, I got a bunch of >> connection >> > > refused error as I mentioned 
before. I checked the logs on the broker >> and >> > > here is what I see: >> > > >> > > [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No >> > > checkpointed highwatermark is found for partition [test,4] >> > > (kafka.cluster.Partition) >> > > >> > > [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in >> 4 >> > > ms. (kafka.log.Log) >> > > >> > > [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in >> 1 >> > > ms. (kafka.log.Log) >> > > >> > > [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in >> 1 >> > > ms. (kafka.log.Log) >> > > >> > > [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in >> 1 >> > > ms.
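Kafka never overwrites records in place, but the broker can be told to delete the oldest log segments once a partition exceeds a size or age bound, which keeps long benchmark runs from filling the disk. A sketch of the relevant broker settings (values are illustrative, not recommendations):

```
# config/server.properties
log.cleanup.policy=delete
# drop the oldest segments once a partition's log exceeds ~100 GB
log.retention.bytes=107374182400
# and/or expire segments older than one day
log.retention.hours=24
```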
Re: How to run the three producers test
Jiefu, I agree with you. I checked the hardware specs of my machines, each one of them has: RAM 256GB ECC Memory (16x 16 GB DDR4 1600MT/s dual rank RDIMMs Disk Two 1 TB 7.2K RPM 3G SATA HDDs For the throughput versus stored data test, it uses 5*10^10 messages, which has the total volume of 5TB, I made the replication factor to be 3, which means the total size including replicas would be 15TB, which apparently overwhelmed the two brokers I use. Thanks. best, Yuheng On Tue, Jul 14, 2015 at 6:06 PM, JIEFU GONG wrote: > Someone may correct me if I am incorrect, but how much disk space do you > have on these nodes? Your exception 'No space left on device' from one of > your brokers seems to suggest that you're full (after all you're writing > 500 million records). If this is the case I believe the expected behavior > for Kafka is to reject any more attempts to write data? > > On Tue, Jul 14, 2015 at 2:27 PM, Yuheng Du > wrote: > > > Also, the log in another broker (not the bootstrap) says: > > > > [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error > > writing to highwatermark file: (kafka.server.ReplicaManager) > > > > > > [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 > because > > of error (kafka.network.Process > > > > or) > > > > java.io.IOException: Connection reset by peer > > > > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) > > > > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > > > > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > > > > at sun.nio.ch.IOUtil.read(IOUtil.java:197) > > > > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) > > > > at kafka.utils.Utils$.read(Utils.scala:380) > > > > at > > > > > kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) > > > > at kafka.network.Processor.read(SocketServer.scala:444) > > > > at kafka.network.Processor.run(SocketServer.scala:340) > > > > at java.lang.Thread.run(Thread.java:745) > > > > > > 
> > java.io.IOException: No space left on device > > > > at java.io.FileOutputStream.writeBytes(Native Method) > > > > at java.io.FileOutputStream.write(FileOutputStream.java:345) > > > > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) > > > > at sun.nio.cs.StreamEncoder.implFlushBuffe > > > > (END) > > > > On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du > > wrote: > > > > > Hi Jiefu, Gwen, > > > > > > I am running the Throughput versus stored data test: > > > bin/kafka-run-class.sh > org.apache.kafka.clients.tools.ProducerPerformance > > > test 500 100 -1 acks=1 bootstrap.servers= > > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 > > batch.size=8196 > > > > > > After around 50,000,000 messages were sent, I got a bunch of connection > > > refused error as I mentioned before. I checked the logs on the broker > and > > > here is what I see: > > > > > > [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No > > > checkpointed highwatermark is found for partition [test,4] > > > (kafka.cluster.Partition) > > > > > > [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4 > > > ms. (kafka.log.Log) > > > > > > [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1 > > > ms. (kafka.log.Log) > > > > > > [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1 > > > ms. (kafka.log.Log) > > > > > > [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1 > > > ms. (kafka.log.Log) > > > > > > [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3 > > > ms. (kafka.log.Log) > > > > > > [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1 > > > ms. (kafka.log.Log) > > > > > > [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1 > > > ms. (kafka.log.Log) > > > > > > [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1 > > > ms. 
(kafka.log.Log) > > > > > > [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'te
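The sizing argument quoted above can be checked with a few lines of shell arithmetic: 5*10^10 records of 100 bytes each, replicated 3 ways, comes to 15 TB on disk, well beyond the two brokers' combined raw capacity.

```shell
# Back-of-envelope storage requirement for the throughput-versus-stored-data
# test: record count and size come from the benchmark command, replication
# factor 3 from the discussion above.
records=50000000000        # 5 * 10^10
record_size=100            # bytes per record
replication=3

raw=$(( records * record_size ))     # unique data in bytes (5 TB)
total=$(( raw * replication ))       # on-disk total with replicas (15 TB)
echo "unique: $raw bytes, with replicas: $total bytes"
```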
Re: How to run the three producers test
Also, the log in another broker (not the bootstrap) says: [2015-07-14 15:18:41,220] FATAL [Replica Manager on Broker 1]: Error writing to highwatermark file: (kafka.server.ReplicaManager) [2015-07-14 15:18:40,160] ERROR Closing socket for /130.127.133.47 because of error (kafka.network.Process or) java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) at kafka.utils.Utils$.read(Utils.scala:380) at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54) at kafka.network.Processor.read(SocketServer.scala:444) at kafka.network.Processor.run(SocketServer.scala:340) at java.lang.Thread.run(Thread.java:745) java.io.IOException: No space left on device at java.io.FileOutputStream.writeBytes(Native Method) at java.io.FileOutputStream.write(FileOutputStream.java:345) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221) at sun.nio.cs.StreamEncoder.implFlushBuffe (END) On Tue, Jul 14, 2015 at 5:24 PM, Yuheng Du wrote: > Hi Jiefu, Gwen, > > I am running the Throughput versus stored data test: > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > test 500 100 -1 acks=1 bootstrap.servers= > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196 > > After around 50,000,000 messages were sent, I got a bunch of connection > refused error as I mentioned before. I checked the logs on the broker and > here is what I see: > > [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No > checkpointed highwatermark is found for partition [test,4] > (kafka.cluster.Partition) > > [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4 > ms. 
(kafka.log.Log) > > [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3 > ms. (kafka.log.Log) > > [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in 1 > ms. (kafka.log.Log) > > [2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in 0 > ms. 
(kafka.log.Log) > > [2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to unrecoverable > I/O error while handling produce request: (kafka.server.KafkaApis) > > kafka.common.KafkaStorageException: I/O exception in append to log 'test-0' > > at kafka.log.Log.append(Log.scala:266) > > at > kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379) > > at > kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365) > > at kafka.utils.Utils$.inLock(Utils.scala:535) > > at kafka.utils.Utils$.inReadLock(Utils.scala:541) > > at > kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365) > > at > kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291) > > at > kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282) > > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) > > at > scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) > > at > scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) > > at scala.coll > > > > Can you help me with this problem? Thanks. > > On Tue, Jul 14, 2015 at 5:12 PM, Yuheng Du > wrote: > >>
Re: How to run the three producers test
Hi Jiefu, Gwen, I am running the Throughput versus stored data test: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test 500 100 -1 acks=1 bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196 After around 50,000,000 messages were sent, I got a bunch of connection refused error as I mentioned before. I checked the logs on the broker and here is what I see: [2015-07-14 15:11:23,578] WARN Partition [test,4] on broker 5: No checkpointed highwatermark is found for partition [test,4] (kafka.cluster.Partition) [2015-07-14 15:12:33,298] INFO Rolled new log segment for 'test-4' in 4 ms. (kafka.log.Log) [2015-07-14 15:12:33,299] INFO Rolled new log segment for 'test-0' in 1 ms. (kafka.log.Log) [2015-07-14 15:13:39,529] INFO Rolled new log segment for 'test-4' in 1 ms. (kafka.log.Log) [2015-07-14 15:13:39,531] INFO Rolled new log segment for 'test-0' in 1 ms. (kafka.log.Log) [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-4' in 3 ms. (kafka.log.Log) [2015-07-14 15:14:48,502] INFO Rolled new log segment for 'test-0' in 1 ms. (kafka.log.Log) [2015-07-14 15:15:51,478] INFO Rolled new log segment for 'test-4' in 1 ms. (kafka.log.Log) [2015-07-14 15:15:51,479] INFO Rolled new log segment for 'test-0' in 1 ms. (kafka.log.Log) [2015-07-14 15:16:52,589] INFO Rolled new log segment for 'test-4' in 1 ms. (kafka.log.Log) [2015-07-14 15:16:52,590] INFO Rolled new log segment for 'test-0' in 1 ms. (kafka.log.Log) [2015-07-14 15:17:57,406] INFO Rolled new log segment for 'test-4' in 1 ms. (kafka.log.Log) [2015-07-14 15:17:57,407] INFO Rolled new log segment for 'test-0' in 0 ms. 
(kafka.log.Log) [2015-07-14 15:18:39,792] FATAL [KafkaApi-5] Halting due to unrecoverable I/O error while handling produce request: (kafka.server.KafkaApis) kafka.common.KafkaStorageException: I/O exception in append to log 'test-0' at kafka.log.Log.append(Log.scala:266) at kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:379) at kafka.cluster.Partition$$anonfun$appendMessagesToLeader$1.apply(Partition.scala:365) at kafka.utils.Utils$.inLock(Utils.scala:535) at kafka.utils.Utils$.inReadLock(Utils.scala:541) at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:365) at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:291) at kafka.server.KafkaApis$$anonfun$appendToLocalLog$2.apply(KafkaApis.scala:282) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) at scala.coll Can you help me with this problem? Thanks. On Tue, Jul 14, 2015 at 5:12 PM, Yuheng Du wrote: > I checked the logs on the brokers, it seems that the zookeeper or the > kafka server process is not running on this broker...Thank you guys. I will > see if it happens again. > > On Tue, Jul 14, 2015 at 4:53 PM, JIEFU GONG wrote: > >> Hmm..yeah some error logs would be nice like Gwen pointed out. Do any of >> your brokers fall out of the ISR when sending messages? It seems like your >> setup should be fine, so I'm not entirely sure. >> >> On Tue, Jul 14, 2015 at 1:31 PM, Yuheng Du >> wrote: >> >> > Jiefu, >> > >> > I am performing these tests on a 6 nodes cluster in cloudlab (a >> > infrastructure built for scientific research). I use 2 nodes as >> producers, >> > 2 as brokers only, and 2 as consumers. I have tested for each individual >> > machines and they work well. 
I did not use AWS. Thank you! >> > >> > On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG wrote: >> > >> > > Yuheng, are you performing these tests locally or using a service >> such as >> > > AWS? I'd try using each separate machine individually first, >> connecting >> > to >> > > the ZK/Kafka servers and ensuring that each is able to first log and >> > > consume messages independently. >> > > >> > > On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira >> > > wrote: >> > > >> > > > Are there any errors on the broker logs? >> > > > >> > > > On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du < >> yuheng.du.h...@gmail.com> >> > > > wrote: >> > > > > Jiefu, >> > > > > >> > > >
Re: How to run the three producers test
I checked the logs on the brokers, it seems that the zookeeper or the kafka server process is not running on this broker...Thank you guys. I will see if it happens again. On Tue, Jul 14, 2015 at 4:53 PM, JIEFU GONG wrote: > Hmm..yeah some error logs would be nice like Gwen pointed out. Do any of > your brokers fall out of the ISR when sending messages? It seems like your > setup should be fine, so I'm not entirely sure. > > On Tue, Jul 14, 2015 at 1:31 PM, Yuheng Du > wrote: > > > Jiefu, > > > > I am performing these tests on a 6 nodes cluster in cloudlab (a > > infrastructure built for scientific research). I use 2 nodes as > producers, > > 2 as brokers only, and 2 as consumers. I have tested for each individual > > machines and they work well. I did not use AWS. Thank you! > > > > On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG wrote: > > > > > Yuheng, are you performing these tests locally or using a service such > as > > > AWS? I'd try using each separate machine individually first, connecting > > to > > > the ZK/Kafka servers and ensuring that each is able to first log and > > > consume messages independently. > > > > > > On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira > > > wrote: > > > > > > > Are there any errors on the broker logs? > > > > > > > > On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du < > yuheng.du.h...@gmail.com> > > > > wrote: > > > > > Jiefu, > > > > > > > > > > Thank you. The three producers can run at the same time. I mean > > should > > > > they > > > > > be started at exactly the same time? (I have three consoles from > each > > > of > > > > > the three machines and I just start the console command manually > one > > by > > > > > one) Or a small variation of the starting time won't matter? 
> > > > > > > > > > Gwen and Jiefu, > > > > > > > > > > I have started the three producers at three machines, after a > while, > > > all > > > > of > > > > > them gives a java.net.ConnectException: > > > > > > > > > > [2015-07-14 12:56:46,352] WARN Error in I/O with producer0-link-0/ > > > > > 192.168.1.1 (org.apache.kafka.common.network.Selector) > > > > > > > > > > java.net.ConnectException: Connection refused.. > > > > > > > > > > [2015-07-14 12:56:48,056] WARN Error in I/O with producer1-link-0/ > > > > > 192.168.1.2 (org.apache.kafka.common.network.Selector) > > > > > > > > > > java.net.ConnectException: Connection refused. > > > > > > > > > > What could be the cause? > > > > > > > > > > Thank you guys! > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG > > > wrote: > > > > > > > > > >> Yuheng, > > > > >> > > > > >> Yes, if you read the blog post it specifies that he's using three > > > > separate > > > > >> machines. There's no reason the producers cannot be started at the > > > same > > > > >> time, I believe. > > > > >> > > > > >> On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du < > > yuheng.du.h...@gmail.com > > > > > > > > >> wrote: > > > > >> > > > > >> > Hi, > > > > >> > > > > > >> > I am running the performance test for kafka. > > > > >> > https://gist.github.com/jkreps > > > > >> > /c7ddb4041ef62a900e6c > > > > >> > > > > > >> > For the "Three Producers, 3x async replication" scenario, the > > > command > > > > is > > > > >> > the same as one producer: > > > > >> > > > > > >> > bin/kafka-run-class.sh > > > > org.apache.kafka.clients.tools.ProducerPerformance > > > > >> > test 5000 100 -1 acks=1 > > > > >> > bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 > > > > >> > buffer.memory=67108864 batch.size=8196 > > > > >> > > > > > >> > So How to I run the test for three producers? Do I just run them > > on > > > > three > > > > >> > separate servers at the same time? 
Will there be some error in > > this > > > > way > > > > >> > since the three producers can't be started at the same time? > > > > >> > > > > > >> > Thanks. > > > > >> > > > > > >> > best, > > > > >> > Yuheng > > > > >> > > > > > >> > > > > >> > > > > >> > > > > >> -- > > > > >> > > > > >> Jiefu Gong > > > > >> University of California, Berkeley | Class of 2017 > > > > >> B.A Computer Science | College of Letters and Sciences > > > > >> > > > > >> jg...@berkeley.edu | (925) 400-3427 > > > > >> > > > > > > > > > > > > > > > > -- > > > > > > Jiefu Gong > > > University of California, Berkeley | Class of 2017 > > > B.A Computer Science | College of Letters and Sciences > > > > > > jg...@berkeley.edu | (925) 400-3427 > > > > > > > > > -- > > Jiefu Gong > University of California, Berkeley | Class of 2017 > B.A Computer Science | College of Letters and Sciences > > jg...@berkeley.edu | (925) 400-3427 >
Re: How to run the three producers test
Jiefu, I am performing these tests on a 6-node cluster in CloudLab (an infrastructure built for scientific research). I use 2 nodes as producers, 2 as brokers only, and 2 as consumers. I have tested each individual machine and they work well. I did not use AWS. Thank you! On Tue, Jul 14, 2015 at 4:20 PM, JIEFU GONG wrote: > Yuheng, are you performing these tests locally or using a service such as > AWS? I'd try using each separate machine individually first, connecting to > the ZK/Kafka servers and ensuring that each is able to first log and > consume messages independently. > > On Tue, Jul 14, 2015 at 1:17 PM, Gwen Shapira > wrote: > > > Are there any errors on the broker logs? > > > > On Tue, Jul 14, 2015 at 11:57 AM, Yuheng Du > > wrote: > > > Jiefu, > > > > > > Thank you. The three producers can run at the same time. I mean should > > they > > > be started at exactly the same time? (I have three consoles from each > of > > > the three machines and I just start the console command manually one by > > > one) Or a small variation of the starting time won't matter? > > > > > > Gwen and Jiefu, > > > > > > I have started the three producers at three machines, after a while, > all > > of > > > them gives a java.net.ConnectException: > > > > > > [2015-07-14 12:56:46,352] WARN Error in I/O with producer0-link-0/ > > > 192.168.1.1 (org.apache.kafka.common.network.Selector) > > > > > > java.net.ConnectException: Connection refused.. > > > > > > [2015-07-14 12:56:48,056] WARN Error in I/O with producer1-link-0/ > > > 192.168.1.2 (org.apache.kafka.common.network.Selector) > > > > > > java.net.ConnectException: Connection refused. > > > > > > What could be the cause? > > > > > > Thank you guys! > > > > > > > > > > > > > > > On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG > wrote: > > > > > >> Yuheng, > > >> > > >> Yes, if you read the blog post it specifies that he's using three > > separate >> machines.
There's no reason the producers cannot be started at the > same > > >> time, I believe. > > >> > > >> On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du > > > >> wrote: > > >> > > >> > Hi, > > >> > > > >> > I am running the performance test for kafka. > > >> > https://gist.github.com/jkreps > > >> > /c7ddb4041ef62a900e6c > > >> > > > >> > For the "Three Producers, 3x async replication" scenario, the > command > > is > > >> > the same as one producer: > > >> > > > >> > bin/kafka-run-class.sh > > org.apache.kafka.clients.tools.ProducerPerformance > > >> > test 5000 100 -1 acks=1 > > >> > bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 > > >> > buffer.memory=67108864 batch.size=8196 > > >> > > > >> > So How to I run the test for three producers? Do I just run them on > > three > > >> > separate servers at the same time? Will there be some error in this > > way > > >> > since the three producers can't be started at the same time? > > >> > > > >> > Thanks. > > >> > > > >> > best, > > >> > Yuheng > > >> > > > >> > > >> > > >> > > >> -- > > >> > > >> Jiefu Gong > > >> University of California, Berkeley | Class of 2017 > > >> B.A Computer Science | College of Letters and Sciences > > >> > > >> jg...@berkeley.edu | (925) 400-3427 > > >> > > > > > > -- > > Jiefu Gong > University of California, Berkeley | Class of 2017 > B.A Computer Science | College of Letters and Sciences > > jg...@berkeley.edu | (925) 400-3427 >
Re: How to run the three producers test
Jiefu, Thank you. The three producers can run at the same time. I mean, should they be started at exactly the same time? (I have three consoles, one on each of the three machines, and I just start the console command manually one by one.) Or does a small variation in starting time not matter? Gwen and Jiefu, I have started the three producers on three machines; after a while, all of them give a java.net.ConnectException: [2015-07-14 12:56:46,352] WARN Error in I/O with producer0-link-0/ 192.168.1.1 (org.apache.kafka.common.network.Selector) java.net.ConnectException: Connection refused.. [2015-07-14 12:56:48,056] WARN Error in I/O with producer1-link-0/ 192.168.1.2 (org.apache.kafka.common.network.Selector) java.net.ConnectException: Connection refused. What could be the cause? Thank you guys! On Tue, Jul 14, 2015 at 2:47 PM, JIEFU GONG wrote: > Yuheng, > > Yes, if you read the blog post it specifies that he's using three separate > machines. There's no reason the producers cannot be started at the same > time, I believe. > > On Tue, Jul 14, 2015 at 11:42 AM, Yuheng Du > wrote: > > > Hi, > > > > I am running the performance test for kafka. > > https://gist.github.com/jkreps > > /c7ddb4041ef62a900e6c > > > > For the "Three Producers, 3x async replication" scenario, the command is > > the same as one producer: > > > > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > > test 5000 100 -1 acks=1 > > bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 > > buffer.memory=67108864 batch.size=8196 > > > > So How to I run the test for three producers? Do I just run them on three > > separate servers at the same time? Will there be some error in this way > > since the three producers can't be started at the same time? > > > > Thanks. > > > > best, > > Yuheng > > > > > > -- > > Jiefu Gong > University of California, Berkeley | Class of 2017 > B.A Computer Science | College of Letters and Sciences > > jg...@berkeley.edu | (925) 400-3427 >
How to run the three producers test
Hi, I am running the performance test for kafka: https://gist.github.com/jkreps/c7ddb4041ef62a900e6c For the "Three Producers, 3x async replication" scenario, the command is the same as for one producer: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196 So how do I run the test for three producers? Do I just run them on three separate servers at the same time? Will there be some error in this case, since the three producers can't be started at exactly the same time? Thanks. best, Yuheng
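A minimal way to script "run it on three servers at the same time" is a loop over the hosts. Here is a dry-run sketch that only prints what each machine would run — the `producer1..3` hostnames are placeholders, and the command arguments are copied from the message above:

```shell
# Dry-run sketch: print the ProducerPerformance invocation for each
# producer machine. Hostnames are hypothetical placeholders; swap the
# echo for an actual ssh (backgrounded with &) to really launch them.
out=$(for host in producer1 producer2 producer3; do
  echo "ssh $host bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test 5000 100 -1 acks=1 bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196"
done)
echo "$out"
```

Launching via ssh this way starts the three producers within a second or so of each other, which should be close enough for the benchmark.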
Latency test
Currently, the latency test that ships with Kafka measures the end-to-end latency between producers and consumers. Is there a way to test the producer-to-broker and broker-to-consumer delays separately? Thanks.
Re: kafka benchmark tests
Also, I guess setting the target throughput to -1 means let it be as high as possible? On Tue, Jul 14, 2015 at 10:36 AM, Yuheng Du wrote: > Thanks. If I set the acks=1 in the producer config options in > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > test7 5000 100 -1 acks=1 bootstrap.servers= > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196? > > Does that mean for each message generated at the producer, the producer > will wait until the broker sends the ack back, then send another message? > > Thanks. > > Yuheng > > On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy > wrote: > >> Yes, A list of Kafka Server host/port pairs to use for establishing the >> initial connection to the Kafka cluster >> >> https://kafka.apache.org/documentation.html#newproducerconfigs >> >> On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du >> wrote: >> >> > Does anyone know what is bootstrap.servers= >> > esv4-hcl198.grid.linkedin.com:9092 means in the following test command: >> > >> > bin/kafka-run-class.sh >> org.apache.kafka.clients.tools.ProducerPerformance >> > test7 5000 100 -1 acks=1 bootstrap.servers= >> > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 >> batch.size=8196? >> > >> > what is bootstrap.servers? Is it the kafka server that I am running a >> test >> > at? >> > >> > Thanks. 
>> > >> > Yuheng >> > >> > >> > >> > >> > On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava < >> e...@confluent.io >> > > >> > wrote: >> > >> > > I implemented (nearly) the same basic set of tests in the system test >> > > framework we started at Confluent and that is going to move into >> Kafka -- >> > > see the wip patch for KIP-25 here: >> > https://github.com/apache/kafka/pull/70 >> > > In particular, that test is implemented in benchmark_test.py: >> > > >> > > >> > >> https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8 >> > > >> > > Hopefully once that's merged people can reuse that benchmark (and add >> to >> > > it!) so they can easily run the same benchmarks across different >> > hardware. >> > > Here are some results from an older version of that test on m3.2xlarge >> > > instances on EC2 using local ephemeral storage (I think... it's been >> > awhile >> > > since I ran these numbers and I didn't document methodology that >> > > carefully): >> > > >> > > INFO:_.KafkaBenchmark:= >> > > INFO:_.KafkaBenchmark:BENCHMARK RESULTS >> > > INFO:_.KafkaBenchmark:= >> > > INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 >> > > rec/sec (65.24 MB/s) >> > > INFO:_.KafkaBenchmark:Single producer, async 3x replication: >> > > 667494.359673 rec/sec (63.66 MB/s) >> > > INFO:_.KafkaBenchmark:Single producer, sync 3x replication: >> > > 116485.764275 rec/sec (11.11 MB/s) >> > > INFO:_.KafkaBenchmark:Three producers, async 3x replication: >> > > 1696519.022182 rec/sec (161.79 MB/s) >> > > INFO:_.KafkaBenchmark:Message size: >> > > INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.62 MB/s) >> > > INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.75 MB/s) >> > > INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.17 MB/s) >> > > INFO:_.KafkaBenchmark: 1: 8306.180862 rec/sec (79.21 MB/s) >> > > INFO:_.KafkaBenchmark: 10: 978.403499 rec/sec (93.31 MB/s) >> > > INFO:_.KafkaBenchmark:Throughput over long run, data > 
memory: >> > > INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.30 >> > MB/s) >> > > INFO:_.KafkaBenchmark:Single consumer: 701031.14 rec/sec >> (56.830500 >> > > MB/s) >> > > INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec >> (267.830800 >> > > MB/s) >> > > INFO:_.KafkaBenchmark:Producer + consumer: >> > > INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.60 >> MB/s) >> > > INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.60 >> MB/s) >> > > INFO:_.KafkaBenchmark:End-to-end latency: median 2.00 ms, 99% >> > > 4.00 ms, 99.9% 19.00 ms
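On the target-throughput question: my reading is that a non-positive `target_records_sec` simply disables throttling, so -1 means "send as fast as possible". The decision amounts to something like this sketch (my interpretation, not the tool's actual code):

```shell
# Sketch of target-throughput handling in a perf tool: a non-positive
# value means "do not throttle"; otherwise pace sends to the target.
target_records_sec=-1
if [ "$target_records_sec" -le 0 ]; then
  mode="unthrottled"
else
  mode="throttled to $target_records_sec records/sec"
fi
echo "$mode"
```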
Re: kafka benchmark tests
Thanks. If I set the acks=1 in the producer config options in bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196? Does that mean for each message generated at the producer, the producer will wait until the broker sends the ack back, then send another message? Thanks. Yuheng On Tue, Jul 14, 2015 at 10:06 AM, Manikumar Reddy wrote: > Yes, A list of Kafka Server host/port pairs to use for establishing the > initial connection to the Kafka cluster > > https://kafka.apache.org/documentation.html#newproducerconfigs > > On Tue, Jul 14, 2015 at 7:29 PM, Yuheng Du > wrote: > > > Does anyone know what is bootstrap.servers= > > esv4-hcl198.grid.linkedin.com:9092 means in the following test command: > > > > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > > test7 5000 100 -1 acks=1 bootstrap.servers= > > esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 > batch.size=8196? > > > > what is bootstrap.servers? Is it the kafka server that I am running a > test > > at? > > > > Thanks. > > > > Yuheng > > > > > > > > > > On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava < > e...@confluent.io > > > > > wrote: > > > > > I implemented (nearly) the same basic set of tests in the system test > > > framework we started at Confluent and that is going to move into Kafka > -- > > > see the wip patch for KIP-25 here: > > https://github.com/apache/kafka/pull/70 > > > In particular, that test is implemented in benchmark_test.py: > > > > > > > > > https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8 > > > > > > Hopefully once that's merged people can reuse that benchmark (and add > to > > > it!) so they can easily run the same benchmarks across different > > hardware. 
> > > Here are some results from an older version of that test on m3.2xlarge > > > instances on EC2 using local ephemeral storage (I think... it's been > > awhile > > > since I ran these numbers and I didn't document methodology that > > > carefully): > > > > > > INFO:_.KafkaBenchmark:= > > > INFO:_.KafkaBenchmark:BENCHMARK RESULTS > > > INFO:_.KafkaBenchmark:= > > > INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 > > > rec/sec (65.24 MB/s) > > > INFO:_.KafkaBenchmark:Single producer, async 3x replication: > > > 667494.359673 rec/sec (63.66 MB/s) > > > INFO:_.KafkaBenchmark:Single producer, sync 3x replication: > > > 116485.764275 rec/sec (11.11 MB/s) > > > INFO:_.KafkaBenchmark:Three producers, async 3x replication: > > > 1696519.022182 rec/sec (161.79 MB/s) > > > INFO:_.KafkaBenchmark:Message size: > > > INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.62 MB/s) > > > INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.75 MB/s) > > > INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.17 MB/s) > > > INFO:_.KafkaBenchmark: 1: 8306.180862 rec/sec (79.21 MB/s) > > > INFO:_.KafkaBenchmark: 10: 978.403499 rec/sec (93.31 MB/s) > > > INFO:_.KafkaBenchmark:Throughput over long run, data > memory: > > > INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.30 > > MB/s) > > > INFO:_.KafkaBenchmark:Single consumer: 701031.14 rec/sec (56.830500 > > > MB/s) > > > INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec > (267.830800 > > > MB/s) > > > INFO:_.KafkaBenchmark:Producer + consumer: > > > INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.60 MB/s) > > > INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.60 MB/s) > > > INFO:_.KafkaBenchmark:End-to-end latency: median 2.00 ms, 99% > > > 4.00 ms, 99.9% 19.00 ms > > > > > > Don't trust these numbers for anything, the were a quick one-off test. > > I'm > > > just pasting the output so you get some idea of what the results might > > look > > > like. 
Once we merge the > tests > > > regularly and results will be available publicly so we'll be able to > keep > > > better tabs on performance, albeit for only a specific class of > hardware. > > > > > > For the batch.size question -- I'm not sure the results in the blog post > > > actually have different settings, it could be accidental divergence > > > between the script and the blog post.
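For reference, the command-line properties discussed in this thread map directly onto the new producer configs. A sketch of the equivalent producer properties, with values copied from the command above (the broker host is the LinkedIn machine from the gist):

```properties
# Sketch of the new-producer configs behind the perf command's
# prop_name=prop_value arguments; see
# https://kafka.apache.org/documentation.html#newproducerconfigs
bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092
acks=1
buffer.memory=67108864
batch.size=8196
```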
Re: kafka benchmark tests
Does anyone know what bootstrap.servers=esv4-hcl198.grid.linkedin.com:9092 means in the following test command: bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers= esv4-hcl198.grid.linkedin.com:9092 buffer.memory=67108864 batch.size=8196? What is bootstrap.servers? Is it the Kafka server that I am running the test on? Thanks. Yuheng On Tue, Jul 14, 2015 at 12:18 AM, Ewen Cheslack-Postava wrote: > I implemented (nearly) the same basic set of tests in the system test > framework we started at Confluent and that is going to move into Kafka -- > see the wip patch for KIP-25 here: https://github.com/apache/kafka/pull/70 > In particular, that test is implemented in benchmark_test.py: > > https://github.com/apache/kafka/pull/70/files#diff-ca984778cf9943407645eb6784f19dc8 > > Hopefully once that's merged people can reuse that benchmark (and add to > it!) so they can easily run the same benchmarks across different hardware. > Here are some results from an older version of that test on m3.2xlarge > instances on EC2 using local ephemeral storage (I think... 
it's been awhile > since I ran these numbers and I didn't document methodology that > carefully): > > INFO:_.KafkaBenchmark:= > INFO:_.KafkaBenchmark:BENCHMARK RESULTS > INFO:_.KafkaBenchmark:= > INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 > rec/sec (65.24 MB/s) > INFO:_.KafkaBenchmark:Single producer, async 3x replication: > 667494.359673 rec/sec (63.66 MB/s) > INFO:_.KafkaBenchmark:Single producer, sync 3x replication: > 116485.764275 rec/sec (11.11 MB/s) > INFO:_.KafkaBenchmark:Three producers, async 3x replication: > 1696519.022182 rec/sec (161.79 MB/s) > INFO:_.KafkaBenchmark:Message size: > INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.62 MB/s) > INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.75 MB/s) > INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.17 MB/s) > INFO:_.KafkaBenchmark: 1: 8306.180862 rec/sec (79.21 MB/s) > INFO:_.KafkaBenchmark: 10: 978.403499 rec/sec (93.31 MB/s) > INFO:_.KafkaBenchmark:Throughput over long run, data > memory: > INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.30 MB/s) > INFO:_.KafkaBenchmark:Single consumer: 701031.14 rec/sec (56.830500 > MB/s) > INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec (267.830800 > MB/s) > INFO:_.KafkaBenchmark:Producer + consumer: > INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.60 MB/s) > INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.60 MB/s) > INFO:_.KafkaBenchmark:End-to-end latency: median 2.00 ms, 99% > 4.00 ms, 99.9% 19.00 ms > > Don't trust these numbers for anything, the were a quick one-off test. I'm > just pasting the output so you get some idea of what the results might look > like. Once we merge the KIP-25 patch, Confluent will be running the tests > regularly and results will be available publicly so we'll be able to keep > better tabs on performance, albeit for only a specific class of hardware. 
> > For the batch.size question -- I'm not sure the results in the blog post > actually have different settings, it could be accidental divergence between > the script and the blog post. The post specifically notes that tuning the > batch size in the synchronous case might help, but that he didn't do that. > If you're trying to benchmark the *optimal* throughput, tuning the batch > size would make sense. Since synchronous replication will have higher > latency and there's a limit to how many requests can be in flight at once, > you'll want a larger batch size to compensate for the additional latency. > However, in practice the increase you see may be negligible. Somebody who > has spent more time fiddling with tweaking producer performance may have > more insight. > > -Ewen > > On Mon, Jul 13, 2015 at 10:08 AM, JIEFU GONG wrote: > > > Hi all, > > > > I was wondering if any of you guys have done benchmarks on Kafka > > performance before, and if they or their details (# nodes in cluster, # > > records / size(s) of messages, etc.) could be shared. > > > > For comparison purposes, I am trying to benchmark Kafka against some > > similar services such as Kinesis or Scribe. Additionally, I was wondering > > if anyone could shed some insight on Jay Kreps' benchmarks that he has > > openly published here: > > > > > https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines > > > > Specifically, I am unsure of why between his tests of 3x synchronous > > replication and 3x async replication he changed the batch.size, as well > as > > why he is seemingly publishing to incorrect topics: > > > > Configs: > > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c > > > > Any help is greatly appreciated! > > > > > > > > -- > > > > Jiefu Gong > > University of California, Berkeley
Re: performance benchmarking of kafka
Hi, I appreciate your response. It works now! It was just a typo in the class name :(. It has nothing to do with whether you are using the binaries or the source version of Kafka. Thanks everyone! On Mon, Jul 13, 2015 at 11:18 PM, tao xiao wrote: > org.apache.kafka.clients.tools.ProducerPerformance resides in > kafka-clients-0.8.2.1.jar. > You need to make sure the jar exists in $KAFKA_HOME/libs/. I use > kafka_2.10-0.8.2.1 > too and here is the output > > % bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > > USAGE: java org.apache.kafka.clients.tools.ProducerPerformance topic_name > num_records record_size target_records_sec [prop_name=prop_value]* > > > > On Tue, 14 Jul 2015 at 05:08 Yuheng Du wrote: > > > I am using the binaries of kafka_2.10-0.8.2.1. Could that be the problem? > > Should I use the source of kafka-0.8.2.1-src.tgz to each of my machiines, > > build them and run the test? > > Thanks. > > > > On Mon, Jul 13, 2015 at 4:37 PM, JIEFU GONG wrote: > > > > > You may need to open up your run-class.sh in a text editor and modify > the > > > classpath -- I believe I had a similar error before. > > > > > > On Mon, Jul 13, 2015 at 1:16 PM, Yuheng Du > > > wrote: > > > > > > > Hi guys, > > > > > > > > I am trying to replicate the test of benchmarking kafka at > > > > > > > > > > > > > > http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines > > > > . > > > > > > > > When I run > > > > > > > > bin/kafka-run-class.sh > > org.apache.kafka.clients.tools.ProducerPerformance > > > > test7 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 > > > > buffer.memory=67108864 batch.size=8196 > > > > > > > > and I got the following error: > > > > Error: Could not find or load main class > > > > org.apache.kafka.client.tools.ProducerPerformance > > > > > > > > What should I fix? Thank you! 
> > > > > > > > > > > > > > > > -- > > > > > > Jiefu Gong > > > University of California, Berkeley | Class of 2017 > > > B.A Computer Science | College of Letters and Sciences > > > > > > jg...@berkeley.edu | (925) 400-3427 > > > > > >
Re: performance benchmarking of kafka
I am using the binaries of kafka_2.10-0.8.2.1. Could that be the problem? Should I copy the source of kafka-0.8.2.1-src.tgz to each of my machines, build it, and run the test? Thanks. On Mon, Jul 13, 2015 at 4:37 PM, JIEFU GONG wrote: > You may need to open up your run-class.sh in a text editor and modify the > classpath -- I believe I had a similar error before. > > On Mon, Jul 13, 2015 at 1:16 PM, Yuheng Du > wrote: > > > Hi guys, > > > > I am trying to replicate the test of benchmarking kafka at > > > > > http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines > > . > > > > When I run > > > > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > > test7 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 > > buffer.memory=67108864 batch.size=8196 > > > > and I got the following error: > > Error: Could not find or load main class > > org.apache.kafka.client.tools.ProducerPerformance > > > > What should I fix? Thank you! > > > > > > -- > > Jiefu Gong > University of California, Berkeley | Class of 2017 > B.A Computer Science | College of Letters and Sciences > > jg...@berkeley.edu | (925) 400-3427 >
Re: performance benchmarking of kafka
Thank you. I see that in run-class.sh, they have the following lines: 63 for file in $base_dir/clients/build/libs/kafka-clients*.jar; 64 do 65 CLASSPATH=$CLASSPATH:$file 66 done So I believe all the jars in the libs/ directory have already been included in the classpath? Which directory does the ProducerPerformance class reside in? Thanks. On Mon, Jul 13, 2015 at 4:37 PM, JIEFU GONG wrote: > You may need to open up your run-class.sh in a text editor and modify the > classpath -- I believe I had a similar error before. > > On Mon, Jul 13, 2015 at 1:16 PM, Yuheng Du > wrote: > > > Hi guys, > > > > I am trying to replicate the test of benchmarking kafka at > > > > > http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines > > . > > > > When I run > > > > bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance > > test7 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 > > buffer.memory=67108864 batch.size=8196 > > > > and I got the following error: > > Error: Could not find or load main class > > org.apache.kafka.client.tools.ProducerPerformance > > > > What should I fix? Thank you! > > > > > > -- > > Jiefu Gong > University of California, Berkeley | Class of 2017 > B.A Computer Science | College of Letters and Sciences > > jg...@berkeley.edu | (925) 400-3427 >
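To see what that classpath loop actually produces, here is a self-contained sketch that mimics the quoted run-class.sh lines against a throwaway directory layout (the paths and jar names are made up for the demo; the released binaries also add a second glob over libs/, which is my assumption about why the class is found there):

```shell
# Mimic run-class.sh's classpath assembly against a fake layout.
base_dir=$(mktemp -d)
mkdir -p "$base_dir/clients/build/libs" "$base_dir/libs"
touch "$base_dir/clients/build/libs/kafka-clients-0.8.2.1.jar"
touch "$base_dir/libs/kafka_2.10-0.8.2.1.jar"

CLASSPATH=""
# The quoted loop only scans clients/build/libs (a source checkout);
# in the binary distribution a libs/*.jar glob picks up the shipped jars.
for file in "$base_dir"/clients/build/libs/kafka-clients*.jar "$base_dir"/libs/*.jar; do
  CLASSPATH="$CLASSPATH:$file"
done
echo "$CLASSPATH"
```

Both jars end up on the classpath, which matches tao xiao's point below that the class lives in kafka-clients-0.8.2.1.jar under $KAFKA_HOME/libs/.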
performance benchmarking of kafka
Hi guys, I am trying to replicate the benchmarking test of kafka at http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines . When I run bin/kafka-run-class.sh org.apache.kafka.clients.tools.ProducerPerformance test7 5000 100 -1 acks=1 bootstrap.servers=192.168.1.1:9092 buffer.memory=67108864 batch.size=8196 I get the following error: Error: Could not find or load main class org.apache.kafka.client.tools.ProducerPerformance What should I fix? Thank you!
Re: A kafka web monitor
Hi Wan, I tried to install DCMonitor, but when I try to clone the project it gives me "Permission denied, the remote end hung up unexpectedly". Can you provide any suggestions on this issue? Thanks. best, Yuheng On Mon, Mar 23, 2015 at 8:54 AM, Wan Wei wrote: > We have make a simple web console to monitor some kafka informations like > consumer offset, logsize. > > https://github.com/shunfei/DCMonitor > > > Hope you like it and offer your help to make it better :) > Regards > Flow >
Re: kafka topic information
Thanks, got it! best, Yuheng On Mon, Mar 9, 2015 at 11:52 AM, Harsha wrote: > In general users are expected to run a zookeeper cluster of 3 or 5 nodes. > Zookeeper requires a quorum of servers running, which means at least ceil(n/2) > servers need to be up. For 3 zookeeper nodes there need to be at least 2 zk > nodes up at any time, i.e. your cluster can function fine in case of 1 > machine failure, and in case of 5 there should be at least 3 nodes up > and running. For more info on zookeeper you can look under here > http://zookeeper.apache.org/doc/r3.4.6/ > http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html > > > -- > Harsha > > On March 9, 2015 at 8:39:00 AM, Yuheng Du (yuheng.du.h...@gmail.com) > wrote: > > Harsha, > > Thanks for reply. So what if the zookeeper cluster fails? Will the topics > information be lost? What fault-tolerant mechanism does zookeeper offer? > > best, > > On Mon, Mar 9, 2015 at 11:36 AM, Harsha wrote: > >> Yuheng, >>kafka keeps cluster metadata in zookeeper along with topic >> metadata as well. You can use zookeeper-shell.sh or zkCli.sh to check zk >> nodes, /brokers/topics will give you the list of topics . >> >> -- >> Harsha >> >> >> On March 9, 2015 at 8:20:59 AM, Yuheng Du (yuheng.du.h...@gmail.com) >> wrote: >> >> I am wondering where does kafka cluster keep the topic metadata (name, >> partition, replication, etc)? How does a server recover the topic's >> metadata and messages after restart and what data will be lost? >> >> Thanks for anyone to answer my questions. >> >> best, >> Yuheng >> >> >
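Harsha's ceil(n/2) quorum rule (equivalently floor(n/2)+1 for the ensemble sizes people actually run) can be checked with a quick sketch:

```shell
# Majority-quorum arithmetic for a ZooKeeper ensemble of n servers:
# quorum = floor(n/2) + 1 must be up, so n - quorum failures are survivable.
summary=""
for n in 3 5; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( n - quorum ))
  summary="$summary$n servers: quorum=$quorum, tolerates $tolerated failure(s)
"
done
printf '%s' "$summary"
```

This is why 3-node ensembles survive one machine failure and 5-node ensembles survive two, as described above.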
Re: kafka topic information
Harsha, Thanks for the reply. So what if the zookeeper cluster fails? Will the topics information be lost? What fault-tolerant mechanism does zookeeper offer? best, On Mon, Mar 9, 2015 at 11:36 AM, Harsha wrote: > Yuheng, > kafka keeps cluster metadata in zookeeper along with topic > metadata as well. You can use zookeeper-shell.sh or zkCli.sh to check zk > nodes, /brokers/topics will give you the list of topics . > > -- > Harsha > > > On March 9, 2015 at 8:20:59 AM, Yuheng Du (yuheng.du.h...@gmail.com) > wrote: > > I am wondering where does kafka cluster keep the topic metadata (name, > partition, replication, etc)? How does a server recover the topic's > metadata and messages after restart and what data will be lost? > > Thanks for anyone to answer my questions. > > best, > Yuheng > >
kafka topic information
I am wondering where the Kafka cluster keeps the topic metadata (name, partitions, replication, etc.). How does a server recover a topic's metadata and messages after a restart, and what data will be lost? Thanks to anyone who can answer my questions. best, Yuheng
Re: Set up kafka cluster
will do, Thanks! On Thu, Mar 5, 2015 at 3:35 PM, Gwen Shapira wrote: > Did you take a look at the quick-start guide? > https://kafka.apache.org/082/quickstart.html > > It shows how to set up a single node, how to validate that its working > and then how to set up multi-node cluster. > > Good luck! > > On Thu, Mar 5, 2015 at 12:30 PM, Yuheng Du > wrote: > > Thank you Gwen, > > > > I also need the kafka cluster continue to provide message brokering > service > > to a Storm cluster after the benchmarking. I am fairly new to cluster > > setups. So is there an instruction telling me how to set up the > three-node > > kafka cluster before running benchmarking? That would be really helpful. > > > > Thanks gain. > > > > best, > > Yuheng > > > > On Thu, Mar 5, 2015 at 3:22 PM, Gwen Shapira > wrote: > > > >> Jay Kreps has a gist with step by step instructions for reproducing > >> the benchmarks used by LinkedIn: > >> https://gist.github.com/jkreps/c7ddb4041ef62a900e6c > >> > >> And the blog with the results: > >> > >> > https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines > >> > >> Gwen > >> > >> On Thu, Mar 5, 2015 at 12:16 PM, Yuheng Du > >> wrote: > >> > Hi everyone, > >> > > >> > I am trying to set up a kafka cluster consisting of three machines. I > >> wanna > >> > run a benchmarking program in them. Can anyone recommend a step by > step > >> > tutorial/instruction of how I can do it? > >> > > >> > Thanks. > >> > > >> > best, > >> > Yuheng > >> >
Re: Set up kafka cluster
Thank you Gwen, I also need the kafka cluster to continue providing message brokering service to a Storm cluster after the benchmarking. I am fairly new to cluster setups, so are there instructions telling me how to set up the three-node kafka cluster before running the benchmarks? That would be really helpful. Thanks again. best, Yuheng On Thu, Mar 5, 2015 at 3:22 PM, Gwen Shapira wrote: > Jay Kreps has a gist with step by step instructions for reproducing > the benchmarks used by LinkedIn: > https://gist.github.com/jkreps/c7ddb4041ef62a900e6c > > And the blog with the results: > > https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines > > Gwen > > On Thu, Mar 5, 2015 at 12:16 PM, Yuheng Du > wrote: > > Hi everyone, > > > > I am trying to set up a kafka cluster consisting of three machines. I > wanna > > run a benchmarking program in them. Can anyone recommend a step by step > > tutorial/instruction of how I can do it? > > > > Thanks. > > > > best, > > Yuheng >
Set up kafka cluster
Hi everyone, I am trying to set up a kafka cluster consisting of three machines. I want to run a benchmarking program on them. Can anyone recommend a step-by-step tutorial/instruction for how I can do it? Thanks. best, Yuheng
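For a three-machine cluster, the quickstart's multi-broker recipe essentially boils down to giving each broker its own id and pointing all of them at the same ZooKeeper. A sketch of the per-broker config (one broker per machine, so the default port can stay the same everywhere; the ZooKeeper hostname is a placeholder):

```properties
# config/server.properties on each machine (sketch; values assumed,
# adjust paths and hosts to your environment)
broker.id=0                      # unique per broker: 0, 1, 2
port=9092                        # same on every host when one broker per box
log.dirs=/tmp/kafka-logs
zookeeper.connect=zk-host:2181   # the shared ZooKeeper ensemble
```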