Luckily, I was just reviewing a lot of this information for my ApacheCon talk next week. Those slides, and (I hope) the video, will be published as soon as the talk is done. I'll give you the information I have from LinkedIn's point of view, but out of order :)
Our Kafka brokers are all the same model. We use a system with 12 CPU cores, currently 2.6 GHz, with hyperthreading enabled. They have 64 GB of memory and dual 1 Gb network interfaces that are bonded, but operating active/passive. The systems have sixteen 1 TB SAS drives in them: 2 are configured as RAID-1 for the OS, and the other 14 are configured as RAID-10 specifically for the Kafka log segments. This gives us a little under 7 TB of usable space for message retention per broker.

On layout, we try to follow a few rules with varying consistency (we're getting more strict over time):
- We do not colocate other applications with Kafka. It gets the entire system to itself.
- ZooKeeper runs on 5 separate servers (also not colocated with other applications). Those servers have the same CPU, memory, and network spec, but they do not have all the disks. They do have 550 GB SSD drives which are dedicated to the ZK transaction logs.
- We try not to have more than 1 Kafka broker in a cluster in the same rack. This is to minimize the kinds of failures that can take partitions offline.
- All Kafka producers and consumers are local to the datacenter that the cluster is in. We use MirrorMaker and aggregate clusters to copy messages between datacenters.

Our smallest cluster is currently 3 brokers, and our largest is 42. It largely depends on how much retention we need and how much traffic that cluster is getting. Our clusters are separated out by the general type of traffic: queuing, tracking, metrics, and logging. Queuing clusters are generally the smallest, while metrics clusters are the largest (with tracking close behind).

We expand clusters based on the following loose rules:
- Disk usage on the log segments partition should stay under 60% (we have a default 4-day retention)
- Network usage on each broker should stay under 75%
- Partition count (leader and follower combined) on each broker should stay under 4000

As far as topic volume goes, it varies widely.
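To make the expansion rules above concrete, here is a minimal sketch of how you might check a broker against them. The thresholds (60% disk, 75% network, 4000 partitions) come from this post; the function name and the idea of feeding it values from your monitoring system are my own illustration, not anything LinkedIn actually runs.

```python
# Hypothetical check of one broker against the "loose rules" described above.
# The three metric values would come from your own monitoring system.

def expansion_reasons(disk_used_pct, network_used_pct, partition_count):
    """Return the list of loose rules this broker is violating."""
    reasons = []
    if disk_used_pct >= 60:
        reasons.append("disk usage on log segments partition at or above 60%")
    if network_used_pct >= 75:
        reasons.append("network usage at or above 75%")
    if partition_count >= 4000:
        reasons.append("partition count (leader + follower) at or above 4000")
    return reasons

# Example: a broker at 65% disk, 40% network, 3200 partitions
# violates only the disk rule, so the cluster is a candidate for expansion.
print(expansion_reasons(65, 40, 3200))
```

If any broker in the cluster returns a non-empty list, the cluster is a candidate for expansion (or for rebalancing partitions across the existing brokers).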
We have topics that only see a single message per minute (or less). Our largest topic by bytes has a peak rate of about 290 Mbits/sec, and our largest topic by messages has a peak rate of about 225k messages/sec. Note that those are in the same cluster.

When we are sizing topics (number of partitions), we use the following guidelines:
- Have at least as many partitions as there are consumers in the largest consumer group
- Keep partition size on disk under 50 GB per partition (for better balance)
- Take into account any other application requirements (keyed messages, specific topic counts required, etc.)

I hope this helps. I'll be covering some of this at my ApacheCon talk (Kafka at Scale: Multi-Tier Architectures) and at the meetup that Jun has set up at ApacheCon. If you have any questions, just ask!

-Todd

On Mon, Apr 6, 2015 at 9:35 AM, Rama Ramani <rama.ram...@live.com> wrote:
> Hello,
>      I am trying to understand some of the common Kafka deployment sizes
> ("small", "medium", "large") and configuration to come up with a set of
> common templates for deployment on Linux. Some of the Qs to answer are:
>
> - Number of nodes in the cluster
> - Machine specs (CPU, memory, number of disks, network, etc.)
> - Speeds and feeds of messages
> - What are some of the best practices to consider when laying out the
> clusters?
> - Is there a sizing calculator for coming up with this?
>
> If you can please share pointers to existing materials or specific details
> of your deployment, that will be great.
>
> Regards
> Rama