Re: Scaling issues with MSK on AWS

2020-09-09 Thread Ben Weintraub
Hi Arti,

From your description, it sounds like your per-partition throughput is
extremely high (380 MB/s across three partitions is ~125 MB/s per
partition). A more typical high-end range would be 5-10 MB/s per partition.
Each follower replica gets mapped to a single replica fetcher thread on the
follower broker, which maintains a single TCP connection to the leader, and
that connection is in turn mapped to a single network processor thread on
the leader side. With the setup you described, you have 3 [partitions] *
(2 - 1) [replicas per partition, minus one for the leader] = 3 follower
replicas across the whole cluster, but 10 network threads per
broker. In this scenario, you're likely not actually spreading the
replication traffic across those multiple network processor threads. This
means that you're probably pinning a single core per broker doing TLS for
replication, while the others are mostly idle (or servicing client
requests).

Bumping your partition count will likely allow you to spread replication
load across more of your network processor threads, and achieve higher CPU
utilization.
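
If you go that route, the partition count can be bumped on the existing
topic with the Java admin client; a rough sketch is below (the topic name,
bootstrap address, and target count of 12 are placeholders for your setup,
and keep in mind that adding partitions changes which partition keyed
messages hash to):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.NewPartitions;

    public class AddPartitions {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "b-1.msk.example:9092"); // placeholder
            try (Admin admin = Admin.create(props)) {
                // Grow the topic from 3 to 12 partitions so replication (and
                // client) traffic can fan out over more network threads.
                admin.createPartitions(Collections.singletonMap(
                        "ingest-topic", NewPartitions.increaseTo(12)))
                     .all().get();
            }
        }
    }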

In addition, it's important to consider not only disk usage but disk write
throughput. With 380 MB/s of ingress traffic and an RF of 2, you'll need a
total of 380 * 2 = 760 MB/s of disk write throughput, or about 250 MB/s per
broker. Coincidentally (or not), MSK brokers use gp2 EBS volumes for
storing the Kafka log directory, which means you're limited to an
absolute max of 250 MiB/s of write throughput per broker. You probably want
to allow yourself some headroom here too, so I'd suggest that you start
from your desired ingress throughput and replication factor, plus your
desired safety margin, and then work backwards to figure out how many gp2
volumes you'll need to achieve that level of write throughput safely. Also
note that while CPU and RAM scale with the larger MSK instance types, disk
throughput does not.
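
As a rough back-of-envelope using your numbers (the ~25% headroom below is
just an example figure, not a recommendation):

    total disk writes   = ingress * RF   = 380 MB/s * 2   = 760 MB/s
    per broker (of 3)   = 760 / 3        ~= 253 MB/s
    with ~25% headroom  = 253 * 1.25     ~= 317 MB/s per broker

which is already over the 250 MiB/s ceiling of a single gp2 volume, so
you'd be sizing for more than one volume's worth of write throughput per
broker, or spreading the same ingress over more brokers.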

Cheers,
Ben

On Wed, Sep 9, 2020 at 7:43 AM Arti Pande  wrote:

> Hi,
>
> We have been evaluating Managed Streaming for Kafka (MSK) on AWS for a
> use-case that requires high-speed data ingestion on the order of millions
> of messages (each ~1 KB in size) per second. We ran into some issues when
> testing this case.
>
> Context:
> To start with, we have set up a single topic with 3 partitions on a 3-node
> MSK cluster of m5.large brokers (2 cores, 8 GB RAM, 500 GB EBS) with
> encryption enabled for
> inter-broker (intra-MSK) communication. Each broker is in a separate AZ
> (total 3 AZs and 3 brokers) and has 10 network threads and 16 IO threads.
>
> When the topic has replication-factor = 2 and min.insync.replicas = 2 and
> the publisher uses acks = all, sending 100+ million messages using 3
> parallel publishers intermittently results in the following error:
>   `Delivery failed: Broker: Not enough in-sync replicas`
> As per the documentation this error is thrown when in-sync replicas are
> lagging behind for more than a configured duration
> (replica.lag.time.max.ms = 30 seconds by default).
>
> However, when we don't see this error, the throughput is around 90 K
> msgs/sec, i.e. 90 MB/sec. CPU usage is below 50% and disk usage is also
> < 20%. So apparently CPU/memory/disk are not the issue?
>
> If we change replication-factor = 1 and min.insync.replicas = 1 and/or
> acks = 1 and keep everything else the same, then there are no errors and
> throughput is ~380 K msgs/sec, i.e. 380 MB/sec. CPU usage was below 30%.
>
> Question:
> Without replication we were able to get 380 MB/sec written, so assuming
> disk, CPU and memory are not an issue, what could be the reason for
> replicas to lag behind at 90 MB/sec throughput? Is the total number of
> threads (10 network + 16 IO) too high for a 2-core machine? But then the
> same thread settings work fine without replication. What could be the
> reason for (1) lower throughput when turning replication on and (2)
> replicas lagging behind when replication is turned on?
>
> Thanks
> Arti
>


Re: New Website Layout

2020-08-06 Thread Ben Weintraub
Plus one to Tom's request - the ability to easily generate links to
specific config options is extremely valuable.
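
(Concretely: being able to hand someone a URL like
https://kafka.apache.org/documentation/#batch.size - to borrow the example
from Tom's mail - and have it land on the right config, rather than
pointing them at a page and telling them what to search for.)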

On Thu, Aug 6, 2020 at 10:09 AM Tom Bentley  wrote:

> Hi Ben,
>
> The documentation for the configs (broker, producer etc) used to function
> as links as well as anchors, which made the url fragments more
> discoverable, because you could click on the link and then copy+paste the
> browser URL:
>
>   <a id="batch.size" href="#batch.size">batch.size</a>
>
> What seems to have happened with the new layout is the <a> tags are empty,
> and no longer enclose the config name,
>
>   <a id="batch.size"></a>
>   batch.size
>
> meaning you can't click on the link to copy and paste the URL. Could the
> old behaviour be restored?
>
> Thanks,
>
> Tom
>
> On Wed, Aug 5, 2020 at 12:43 PM Luke Chen  wrote:
>
> > When entering the streams doc, it'll always show:
> > *You're viewing documentation for an older version of Kafka - check out
> > our current documentation here.*
> >
> >
> >
> > On Wed, Aug 5, 2020 at 6:44 PM Ben Stopford  wrote:
> >
> > > Thanks for the PR and feedback Michael. Appreciated.
> > >
> > > On Wed, 5 Aug 2020 at 10:49, Mickael Maison  wrote:
> > >
> > > > Thank you, it looks great!
> > > >
> > > > I found a couple of small issues:
> > > > - It's not rendering correctly with http.
> > > > - It's printing "called" to the console. I opened a PR to remove the
> > > > console.log() call: https://github.com/apache/kafka-site/pull/278
> > > >
> > > > On Wed, Aug 5, 2020 at 9:45 AM Ben Stopford  wrote:
> > > > >
> > > > > The new website layout has gone live as you may have seen. There are a
> > > > > couple of rendering issues in the streams developer guide that we're
> > > > > getting addressed. If anyone spots anything else could they please
> > > > > reply to this thread.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Ben
> > > > >
> > > > > On Fri, 26 Jun 2020 at 11:48, Ben Stopford  wrote:
> > > > >
> > > > > > Hey folks
> > > > > >
> > > > > > We've made some updates to the website's look and feel. There is a
> > > > > > staged version in the link below.
> > > > > >
> > > > > > https://ec2-13-57-18-236.us-west-1.compute.amazonaws.com/
> > > > > > username: kafka
> > > > > > password: streaming
> > > > > >
> > > > > > Comments welcomed.
> > > > > >
> > > > > > Ben
> > > > > >
> > > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Ben Stopford
> > >
> > > Lead Technologist, Office of the CTO
> > >
> > >
> >
>


Re: Broker always out of ISRs

2020-04-01 Thread Ben Weintraub
I've had the same experience as Liam with this symptom (all follower
replicas hosted on a single broker and fetching from a given leader getting
stuck). It sounds likely that the replica fetcher thread is either getting
stuck or dying with an unhandled exception.

In the former case, jstack output can be helpful for understanding why the
fetcher is stuck. There may or may not be a message in the broker logs on
the broker that's failing to get in sync.

In the latter case, there should be evidence in the broker logs on the
broker that's failing to get in sync (and the thread will be notably absent
from jstack output).
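
If I'm remembering the thread naming right, the fetchers show up in jstack
output with names along the lines of

    "ReplicaFetcherThread-0-2"   (fetcher id 0, fetching from broker 2)

so it's quick to confirm whether they're still alive and, if they are, what
they're blocked on.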

Ben

On Wed, Apr 1, 2020 at 5:40 PM Liam Clarke 
wrote:

> Hi Zach,
>
> If you check the controller.log on the cluster's controller, do you see
> broker 2 bouncing in and out of ISRs? There'll be logs to that effect. Or
> is it just never getting in sync in the first place?
>
> Whenever I've had this issue in the past, it's been because the replica
> fetcher has died. Hate to say this, but have you tried turning broker 2 off
> and on again? It's usually how I've resolved this issue when a broker won't
> stay in the ISR. Also make sure that there's enough CPU/network headroom on
> the machine it's running on - we've usually hit this issue when CPU was
> very high or the network was saturated.
>
> Cheers,
>
> Liam Clarke-Hutchinson
>
> On Thu, Apr 2, 2020 at 8:51 AM Zach Cox  wrote:
>
> > Hi Liam,
> >
> >
> > > Any issues with partitions broker 2 is leader of?
> > >
> >
> > Earlier today, broker 2 was not leader of any partitions. At that time, 2
> > appeared to be in ISRs of all partitions where 1 was leader, but 2 was not
> > in any ISRs of partitions where 0 was leader.
> >
> > Currently, broker 2 is leader of 55 partitions, but does not appear to be
> > in ISRs of any other partitions, whether 0 or 1 is leader.
> >
> >
> > > Also, have you checked b2's server.log?
> > >
> >
> > We don't see any logs that obviously indicate the problem, although we're
> > also not sure what we should be looking for. There are a few Zookeeper
> > client timeouts, but we haven't correlated those with anything yet.
> >
> > Thanks,
> > Zach
> >
>


Client issues following preferred replica elections

2019-07-01 Thread Ben Weintraub
Hi!

I help operate a Kafka cluster with a large number of clients (2-10k
established connections to the Kafka port at any given time on a given
broker).

When we take one of these brokers down for maintenance using a controlled
shutdown, all is fine. Bringing it back online is similarly fine - the
broker re-joins the cluster and gets back in sync quickly. However, when
initiating a preferred replica election, a minority of producers get stuck
in a state where they cannot produce to the restored broker, emitting
errors like this:

Expiring  record(s) for -:  ms has passed since
batch creation plus linger time

I realize that there are multiple ways of approaching this problem (client
vs. broker-side, app-level vs. system-level tuning, etc), but our current
situation is that we've got a lot more control over the brokers than the
clients, so I'm interested in focusing on what can be done broker-side to
make these events less impactful to clients. As an aside: our brokers are
currently on Kafka 0.10.2.1 (I know, we're working towards an upgrade, but
it's a ways out still), and most clients are on that same version of the
client libs.

To that end, I've been trying to understand what happens broker-side on the
restored broker immediately following the replica election, and I've found
two clues:

* The TcpExtListenOverflows and TcpExtListenDrops counters both spike
briefly
* The broker starts sending TCP SYN cookies to clients

Based on my reading (primarily this article:
https://blog.cloudflare.com/syn-packet-handling-in-the-wild/), it sounds
like these symptoms indicate that the SYN and/or accept queues are
overflowing. The sizes of those queues appear to be controlled by the
backlog argument to the listen() call on the listening socket, which
doesn't appear to be configurable in Kafka.
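
For context, here's a minimal Java sketch of where that backlog parameter
lives (this is only an illustration, not the actual Kafka SocketServer
code; the port and backlog values are made up):

    import java.net.InetSocketAddress;
    import java.net.ServerSocket;

    public class BacklogSketch {
        public static void main(String[] args) throws Exception {
            ServerSocket server = new ServerSocket();
            // bind(addr) alone would leave the accept-queue backlog at the
            // JDK default of 50; the two-arg form passes an explicit hint,
            // which the kernel caps at net.core.somaxconn.
            server.bind(new InetSocketAddress(9092), 1024);
            server.accept().close(); // accept a single connection, then exit
            server.close();
        }
    }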

I don't have evidence of this, but I'm speculating that the silent dropping
of SYN and/or ACK packets by brokers might be a triggering cause for the
hangs seen by some producers.

All of this makes me wonder about two things:

* Has anyone else seen issues with preferred replica elections causing
producers to get stuck like this? If so what remediations have folks put in
place for these issues?
* Is there a reason that the TCP accept backlog size in the brokers is not
configurable? It looks like it just inherits the default of 50 from the
JVM. It seems like bumping this queue size is the 'standard' advice given
for handling legitimate connection bursts more gracefully.

Thanks!
Ben