Ok, I filed a jira for that: https://issues.apache.org/jira/browse/KAFKA-1273
-- Original --
From: "Jun Rao";
Date: Friday, February 14, 2014, 11:54 PM
To: "users@kafka.apache.org";
Subject: Re: Surprisingly high network traffic between kafka servers
Hey, thanks so much for pointing this out. I think that this is likely
what is happening for us. I will attempt this fix.
Cheers,
Carl
On Thu, Feb 13, 2014 at 8:01 PM, zhong dong wrote:
> We encountered this problem, too.
>
> And our problem is that we set the message.max.bytes larger than
Yeah that is a bug. We should be giving an error here rather than retrying.
-Jay
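The retry behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's actual replica fetcher: the function name, the `max_attempts` cap, and the idea that each retry re-transfers the same capped chunk are illustrative simplifications.

```python
def simulate_follower_fetch(message_size, fetch_max_bytes, max_attempts=5):
    """Toy model of a follower trying to replicate one message.

    Returns (replicated, bytes_transferred). Hypothetical helper for
    illustration only; it is not Kafka's real fetch logic.
    """
    bytes_transferred = 0
    for _ in range(max_attempts):
        # Assumption: the leader returns at most fetch_max_bytes per fetch.
        chunk = min(message_size, fetch_max_bytes)
        bytes_transferred += chunk
        if chunk >= message_size:
            # The whole message arrived, so the follower can advance.
            return True, bytes_transferred
        # A partial message cannot be appended, so the follower retries
        # from the same offset and pulls the same bytes again.
    return False, bytes_transferred


# A 2,000,000-byte message never fits into a 1,048,576-byte fetch,
# so every attempt moves data over the network without making progress.
replicated, transferred = simulate_follower_fetch(2_000_000, 1_048_576)
print(replicated, transferred)
```

In the model, traffic grows linearly with the number of retries while the follower's offset never moves, which is consistent with the sustained traffic reported in this thread.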
On Fri, Feb 14, 2014 at 7:54 AM, Jun Rao wrote:
> Hi, Zhong,
>
> Thanks for sharing this. We probably should add a sanity check in the
> broker to make sure that replica.fetch.max.bytes >= message.max.bytes.
> Cou
Hi, Zhong,
Thanks for sharing this. We probably should add a sanity check in the
broker to make sure that replica.fetch.max.bytes >= message.max.bytes.
Could you file a jira for that?
Jun
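A minimal sketch of the kind of sanity check suggested above, written as a standalone script rather than as broker code. The function names are hypothetical, and the fallback values are assumed to be the 0.8 defaults; verify them against the documentation for your broker version.

```python
# Assumed 0.8 defaults; check your broker's documentation before relying on them.
ASSUMED_DEFAULTS = {
    "message.max.bytes": 1000000,
    "replica.fetch.max.bytes": 1048576,
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_replica_fetch_size(props):
    """Raise ValueError unless replica.fetch.max.bytes >= message.max.bytes."""
    message_max = int(
        props.get("message.max.bytes", ASSUMED_DEFAULTS["message.max.bytes"])
    )
    replica_fetch_max = int(
        props.get("replica.fetch.max.bytes", ASSUMED_DEFAULTS["replica.fetch.max.bytes"])
    )
    if replica_fetch_max < message_max:
        raise ValueError(
            "replica.fetch.max.bytes (%d) is smaller than message.max.bytes (%d)"
            % (replica_fetch_max, message_max)
        )
    return message_max, replica_fetch_max


config_text = """
# illustrative values only
message.max.bytes=5000000
replica.fetch.max.bytes=8000000
"""
print(check_replica_fetch_size(parse_properties(config_text)))
```

Running a check like this against each broker's server.properties before a restart would flag the misconfiguration early, instead of leaving followers to retry indefinitely.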
On Thu, Feb 13, 2014 at 8:01 PM, zhong dong wrote:
> We encountered this problem, too.
>
> And our p
We encountered this problem, too.
And our problem was that we set message.max.bytes larger than
replica.fetch.max.bytes.
After we changed replica.fetch.max.bytes to a larger number, the problem
was solved.
Hey Joe,
Those periods with "no traffic" actually are periods of expected
traffic between nodes. It's just that the abnormal traffic is so high that
the normal traffic is not visible. Also, once traffic goes crazy, the
only way to reset it is to stop all kafka nodes (as opposed to doing a
rolling restart).
I have be
Carl, looking at the boundary chart it looks like you have periods of no
traffic also... prior to the spikes.
I also noticed you are using AWS from your logs, what instance types are
you using? Do you have any network checks in place?
The logs show underReplication=true which leads towards what
One last thing, I have collected a snippet of the network traffic
between Kafka instances using tcpdump. However, it contains some
customer data and less than a minute's worth was over 1 GB, so I can't
really post it here, but I could possibly share offline if it can help
debug the issue.
On Thu, F
Re:
> Could you also check if the on-disk data size/rate match the network
> traffic?
While I have not explicitly checked this, I would say that the answer
is no. The network is over 1Gbps and I have set up monitoring for disk
space and nothing out of the norm is happening there. The expected
data
Ok, sorry for the lack of concrete information to help debug this
issue. I am not really an ops guy, so I am trying to keep up.
First, I added boundary to our servers. Normal Kafka behavior should
be resulting in 500 kbps or less on our cluster. Here you can see that
it's peaking at over 1 Gbps:
Could you also check if the on-disk data size/rate match the network
traffic?
Thanks,
Jun
On Thu, Feb 6, 2014 at 7:48 PM, Carl Lerche wrote:
> So, the "good news" is that the problem came back again. The bad news
> is that I disabled debug logs as it was filling disk (and I had other
> fires
So, if you start from scratch (new environment and download of the Kafka
release), could you post the list of steps to reproduce this issue?
On Thu, Feb 6, 2014 at 7:48 PM, Carl Lerche wrote:
> So, the "good news" is that the problem came back again. The bad news
> is that I disabled debug logs
So, the "good news" is that the problem came back again. The bad news
is that I disabled debug logs as it was filling disk (and I had other
fires to put out). I will re-enable debug logs and wait for it to
happen again.
On Thu, Feb 6, 2014 at 4:05 AM, Neha Narkhede wrote:
> Carl,
>
> It will help
Carl,
It will help if you can list the steps to reproduce this issue starting
from a fresh installation. Your setup, the way it stands, seems to have
gone through some config and state changes.
Thanks,
Neha
On Wed, Feb 5, 2014 at 5:17 PM, Joel Koshy wrote:
> On Wed, Feb 05, 2014 at 04:51:16PM
On Wed, Feb 05, 2014 at 04:51:16PM -0800, Carl Lerche wrote:
> So, I tried enabling debug logging, I also made some tweaks to the
> config (which I probably shouldn't have) and craziness happened.
>
> First, some more context. Besides the very high network traffic, we
> were seeing some other issu
So, I tried enabling debug logging, I also made some tweaks to the
config (which I probably shouldn't have) and craziness happened.
First, some more context. Besides the very high network traffic, we
were seeing some other issues that we were not focusing on yet.
* Even though the log retention w
Can you enable DEBUG logging in log4j and see what requests are coming in?
-Jay
On Tue, Feb 4, 2014 at 9:51 PM, Carl Lerche wrote:
> Hi Jay,
>
> I do not believe that I have changed the replica.fetch.wait.max.ms
> setting. Here I have included the kafka config as well as a snapshot
> of jnetto
I'm not really an ops person either. I was using jnettop for this.
On Wednesday, February 5, 2014, S Ahmed wrote:
> Sorry I'm not an ops person, but what tools do you use to monitor traffic
> between servers?
>
>
> On Tue, Feb 4, 2014 at 11:46 PM, Carl Lerche wrote:
>
> > Hello,
> >
> > I'
Sorry I'm not an ops person, but what tools do you use to monitor traffic
between servers?
On Tue, Feb 4, 2014 at 11:46 PM, Carl Lerche wrote:
> Hello,
>
> I'm running a 0.8.0 Kafka cluster of 3 servers. The service that it is
> for is not in full production yet, so the data written to cluster i
Hi Jay,
I do not believe that I have changed the replica.fetch.wait.max.ms
setting. Here I have included the kafka config as well as a snapshot
of jnettop from one of the servers.
https://gist.github.com/carllerche/4f2cf0f0f6d1e891f482
The bottom row (89.9K/s) is the producer (it lives on a Kafk
No this is not normal.
Checking twice a second (using 500ms default) for new data shouldn't cause
high network traffic (that should be like < 1KB of overhead). I don't think
that explains things. Is it possible that setting has been overridden?
-Jay
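A back-of-the-envelope sketch of the estimate above. The 500 ms wait comes from the thread; the bytes per request/response pair and the number of brokers each follower fetches from are assumptions chosen purely for illustration, and the function name is hypothetical.

```python
def idle_fetch_overhead(wait_max_ms, bytes_per_round_trip, leaders_fetched_from):
    """Estimate idle replication overhead in bytes per second.

    Assumes one fetch per wait_max_ms to each leader when no new data
    arrives. bytes_per_round_trip is an assumed figure, not a measurement.
    """
    fetches_per_second = 1000.0 / wait_max_ms
    return fetches_per_second * bytes_per_round_trip * leaders_fetched_from


# 500 ms wait (from the thread); 200 bytes per round trip and
# 2 peer brokers in a 3-node cluster are illustrative assumptions.
overhead = idle_fetch_overhead(500, 200, 2)

# Roughly 1 Gbps, the peak reported elsewhere in this thread, in bytes/s.
observed = 1_000_000_000 / 8

print(overhead, observed / overhead)
```

Even if the assumed per-request size is off by an order of magnitude, idle polling stays several orders of magnitude below the reported peak, so polling frequency alone would not account for the traffic.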
On Tue, Feb 4, 2014 at 9:25 PM, Guozhang Wang
Hi Carl,
For each partition the follower will also fetch data from the leader
replica, even if there is no new data in the leader replicas.
One thing you can try is to increase replica.fetch.wait.max.ms (default value
500ms) so that the followers' fetching request frequency to the leader can
be red
Hello,
I'm running a 0.8.0 Kafka cluster of 3 servers. The service that it is
for is not in full production yet, so the data written to cluster is
minimal (seems to average between 100kb/s -> 300kb/s per server). I
have configured Kafka to have 3 replicas. I am noticing that each
Kafka server is