Emmanuel, if it helps, here's a little more detail on the hardware spec we
are using at the moment:

12 CPU (HT enabled)
64 GB RAM
16 x 1TB SAS drives (2 are used as a RAID-1 set for the OS, 14 are a
RAID-10 set just for the Kafka log segments)

We don't colocate any other applications with Kafka except for a couple
monitoring agents. Zookeeper runs on completely separate nodes.

I suggest starting with the basics - watch the CPU, memory, and disk IO
usage on the brokers as you are testing. You're likely going to find that
one of these three is the constraint. Disk IO in particular can cause a
significant increase in produce latency as it climbs, even above just
10-15% utilization.
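
For example, something along these lines (a rough sketch, not our production
tooling - it assumes Linux and the psutil Python package) will show all three
side by side while a test runs:

  import time
  import psutil

  INTERVAL = 5  # seconds between samples

  prev = psutil.disk_io_counters(perdisk=True)
  while True:
      time.sleep(INTERVAL)
      cur = psutil.disk_io_counters(perdisk=True)
      cpu = psutil.cpu_percent(interval=None)   # CPU used since last sample
      mem = psutil.virtual_memory().percent
      print("cpu=%5.1f%% mem=%5.1f%%" % (cpu, mem))
      for disk, now in cur.items():
          # busy_time (Linux) is ms the device spent doing I/O; its delta
          # over the window approximates iostat's %util for that disk.
          busy_ms = now.busy_time - prev[disk].busy_time
          print("  %-10s util=%5.1f%%" % (disk, 100.0 * busy_ms / (INTERVAL * 1000)))
      prev = cur

If produce latency climbs while one of those numbers does, that's your
constraint.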

-Todd


On Fri, Mar 20, 2015 at 3:41 PM, Emmanuel <ele...@msn.com> wrote:

> This is why I'm confused: I'm trying to benchmark and I see numbers that
> seem pretty low to me... 8000 events/sec on 2 brokers with 3 CPUs each
> and 5 partitions should be way faster than this, and I don't know where
> to start to debug...
> the kafka-consumer-perf-test script gives me ridiculously low numbers
> (1000 events/sec/thread)
>
> So what could be causing this?
> From: jbringhu...@linkedin.com.INVALID
> To: users@kafka.apache.org
> Subject: Re: Post on running Kafka at LinkedIn
> Date: Fri, 20 Mar 2015 22:16:29 +0000
>
> Keep in mind that these brokers aren't really stressed too much at any
> given time -- we need to stay ahead of the capacity curve.
> Your message throughput will really just depend on what hardware you're
> using. However, in the past, we've benchmarked at 400,000 to more than
> 800,000 messages / broker / sec, depending on configuration (
> https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
> ).
>
> -Jon
> On Mar 20, 2015, at 3:03 PM, Emmanuel <ele...@msn.com> wrote:
>
> 800B messages / day = 9.26M messages / sec over 1100 brokers
> = ~8400 messages / broker / sec
> Do I get this right?
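>
> Checking that arithmetic quickly in Python (just a sanity check of the
> numbers above):
>
>   messages_per_day = 800e9              # 800B messages / day
>   brokers = 1100
>   per_sec = messages_per_day / 86400    # ~9.26M messages / sec overall
>   per_broker = per_sec / brokers        # ~8,400 messages / broker / sec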
> Trying to benchmark my own test cluster and that's what I see with 2
> brokers...Just wondering if my numbers are good or bad...
>
>
> Subject: Re: Post on running Kafka at LinkedIn
> From: cl...@kafka.guru
> Date: Fri, 20 Mar 2015 14:27:58 -0700
> To: users@kafka.apache.org
>
> Yep! We are growing :)
>
> -Clark
>
> Sent from my iPhone
>
> On Mar 20, 2015, at 2:14 PM, James Cheng <jch...@tivo.com> wrote:
>
> Amazing growth numbers.
>
> At the meetup on 1/27, Clark Haskins presented their Kafka usage at the
> time. It was:
>
> Bytes in: 120 TB
> Messages In: 585 billion
> Bytes out: 540 TB
> Total brokers: 704
>
> In Todd's post, the current numbers:
>
> Bytes in: 175 TB (45% growth)
> Messages In: 800 billion (36% growth)
> Bytes out: 650 TB (20% growth)
> Total brokers: 1100 (56% growth)
>
> That much growth in just 2 months? Wowzers.
>
> -James
>
> On Mar 20, 2015, at 11:30 AM, James Cheng <jch...@tivo.com> wrote:
>
> For those who missed it:
>
> The Kafka Audit tool was also presented at the 1/27 Kafka meetup:
> http://www.meetup.com/http-kafka-apache-org/events/219626780/
>
> Recorded video is here, starting around the 40 minute mark:
> http://www.ustream.tv/recorded/58109076
>
> Slides are here:
> http://www.ustream.tv/recorded/58109076
>
> -James
>
> On Mar 20, 2015, at 9:47 AM, Todd Palino <tpal...@gmail.com> wrote:
>
> For those who are interested in detail on how we've got Kafka set up at
> LinkedIn, I have just published a new post to our Engineering blog titled
> "Running Kafka at Scale"
>
>   https://engineering.linkedin.com/kafka/running-kafka-scale
>
> It's a general overview of our current Kafka install, tiered architecture,
> audit, and the libraries we use for producers and consumers. You'll also be
> seeing more posts from the SRE team here in the coming weeks, with deeper
> looks into both Kafka and Samza.
>
> Additionally, I'll be giving a talk at ApacheCon next month on running
> tiered Kafka architectures. If you're in Austin for that, please come by
> and check it out.
>
> -Todd
>
>
>
>
