Re: ReplicaFetcherThread Error, Massive Logging, and Leader Flapping

2015-06-10 Thread Kyle Banker
On Mon, Apr 20, 2015 at 3:46 PM, Kyle Banker wrote: > Hi Jiangjie, > > There is nothing of note in the controller log. I've attached that log > along with the state change log in the following gist: > https://gist.github.com/banker/78b56a3a5246b25ace4c > >

Re: ReplicaFetcherThread Error, Massive Logging, and Leader Flapping

2015-04-20 Thread Kyle Banker
ntroller > log? > > Jiangjie (Becket) Qin > > > > On 4/16/15, 12:09 PM, "Kyle Banker" wrote: > > >Hi, > > > >I've run into a pretty serious production issue with Kafka 0.8.2, and I'm > >wondering what my options are. > > > >

ReplicaFetcherThread Error, Massive Logging, and Leader Flapping

2015-04-16 Thread Kyle Banker
Hi, I've run into a pretty serious production issue with Kafka 0.8.2, and I'm wondering what my options are. ReplicaFetcherThread Error I have a broker on a 9-node cluster that went down for a couple of hours. When it came back up, it started spewing constant errors of the following form: INFO

Re: Consumer Group Lag Reporting

2015-04-07 Thread Kyle Banker
lized Log Management > Solr & Elasticsearch Support * http://sematext.com/ > > > On Mon, Apr 6, 2015 at 6:14 PM, Kyle Banker wrote: > > > What is the best practice for reporting the lag on individual consumer > > groups (e.g., to Graphite)? > > > > A recent fo

Consumer Group Lag Reporting

2015-04-06 Thread Kyle Banker
What is the best practice for reporting the lag on individual consumer groups (e.g., to Graphite)? A recent forum post (http://search-hadoop.com/m/4TaT4x9qWm1) seems to indicate that parsing the output of the consumer offset checker tool and reporting that independently is what folks do. Is there a
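A minimal sketch of the parse-and-report approach mentioned in this post. It assumes the offset checker prints whitespace-separated columns in the order `Group Topic Pid Offset logSize Lag Owner` (an assumption about the tool's layout that should be checked against real output) and emits one line per partition in Graphite's plaintext format (`path value timestamp`). The sample text, the `kafka.lag` prefix, and the function name are hypothetical.

```python
def lag_to_graphite(checker_output, timestamp, prefix="kafka.lag"):
    """Turn offset-checker text into Graphite plaintext lines."""
    lines = []
    for row in checker_output.strip().splitlines():
        fields = row.split()
        # Skip the header row and anything that is not a data row:
        # a data row is assumed to carry an integer lag in column 6.
        if len(fields) < 6 or not fields[5].lstrip("-").isdigit():
            continue
        group, topic, partition = fields[0], fields[1], fields[2]
        lag = int(fields[5])
        lines.append(f"{prefix}.{group}.{topic}.{partition} {lag} {timestamp}")
    return lines


# Hypothetical sample output; the real tool's output may differ.
SAMPLE = """
Group    Topic    Pid  Offset  logSize  Lag  Owner
mygroup  mytopic  0    1500    1620     120  none
mygroup  mytopic  1    900     900      0    none
"""

graphite_lines = lag_to_graphite(SAMPLE, timestamp=1428364800)
```

Group or topic names containing dots would split into extra Graphite path segments, so a real reporter would likely need to escape them.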

Re: kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
ster worked fine the week before during testing and also after during > more testing with rc3. > > > > > > > > 2015-02-05 19:22 GMT+01:00 Kyle Banker : > > > Digging in a bit more, it appears that the "down" broker had likely > > partially failed. Thu

Re: kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
Digging in a bit more, it appears that the "down" broker had likely partially failed. Thus, it was still attempting to fetch offsets that no longer exist. Does this make sense as an explanation of the above-mentioned behavior? On Thu, Feb 5, 2015 at 10:58 AM, Kyle Banker wrote: >

Re: kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
tions. This makes some sense, of course. Still, the log message doesn't. On Thu, Feb 5, 2015 at 10:39 AM, Kyle Banker wrote: > I have a 9-node Kafka cluster, and all of the brokers just started > spouting the following error: > > ERROR [Replica Manager on Broker 1]: Error when pr

kafka.server.ReplicaManager error

2015-02-05 Thread Kyle Banker
I have a 9-node Kafka cluster, and all of the brokers just started spouting the following error: ERROR [Replica Manager on Broker 1]: Error when processing fetch request for partition [mytopic,57] offset 0 from follower with correlation id 58166. Possible cause: Request for offset 0 but we only ha

Re: Missing Per-Topic BrokerTopicMetrics in v0.8.2.0

2015-01-27 Thread Kyle Banker
> >> > > > >> > > Thus, the topic metrics all look the same, and get lumped into the > >> > > top-level BrokerTopicMetrics (and thus that will now be double > >> counted). > >> > It > >> > > looks like the

Re: [VOTE] 0.8.2.0 Candidate 2 (with the correct links)

2015-01-26 Thread Kyle Banker
This is still preliminary, but it looks as if the change to metric names for per-topic metrics (bytes/messages in/out) is preventing these metrics from being reported to Yammer/Graphite. If this isn't intentional, it should probably be addressed before release. On Wed, Jan 21, 2015 at 9:28 AM, Jun

Re: Missing Per-Topic BrokerTopicMetrics in v0.8.2.0

2015-01-26 Thread Kyle Banker
ytesInPerSec,topic=test > > The full list of the refactored mbean names can be found at > http://kafka.apache.org/082/documentation.html#monitoring > > Thanks, > > Jun > > > On Mon, Jan 26, 2015 at 2:42 PM, Kyle Banker wrote: > > > I've been using a custom

Missing Per-Topic BrokerTopicMetrics in v0.8.2.0

2015-01-26 Thread Kyle Banker
I've been using a custom KafkaMetricsReporter to report Kafka broker metrics to Graphite. In v0.8.1.1, Kafka was reporting bytes and messages in and out for all topics together and for each individual topic. After upgrading to v0.8.2.0, these metrics are no longer being reported. I'm only seeing

Offset backup and restore in Kafka v0.8.2

2015-01-08 Thread Kyle Banker
We currently checkpoint partition offsets from ZK into an external database, and we have a process for restoring those offsets to ZK in case we ever need to replay data from Kafka. Since offsets are now stored on the broker, the ConsumerOffsetChecker no longer works. Is there a new way (tool) to d

Re: Achieving Consistency and Durability

2014-10-24 Thread Kyle Banker
've heard rumors that you are very very good at documenting, so I'm > looking forward to your comments. > > Note that I'm completely ignoring the acks>1 case since we are about > to remove it. > > Gwen > > On Wed, Oct 15, 2014 at 1:21 PM, Kyle Banker wro

Re: Consistency and Availability on Node Failures

2014-10-16 Thread Kyle Banker
of box. You can achieve this using > the > > new java producer. The new java producer allows you to pick an arbitrary > > partition when sending a message. If you receive > NotEnoughReplicasException > > when sending a message, you can resend it to another partition. > >

Re: Consistency and Availability on Node Failures

2014-10-15 Thread Kyle Banker
, will eventually find an available partition. But that would certainly be a costly way of finding an available partition. On Tue, Oct 14, 2014 at 3:05 PM, cac...@gmail.com wrote: > Wouldn't this work only for producers using random partitioning? > > On Tue, Oct 14, 2014 at 1:51 P

Re: Achieving Consistency and Durability

2014-10-15 Thread Kyle Banker
4, 2014 at 12:27 PM, Scott Reynolds > wrote: > > A question about 0.8.1.1 and acks. I was under the impression that > setting > > acks to 2 will not throw an exception when there is only one node in ISR. > > Am I incorrect ? Thus the need for min_isr. > > > > On Tue,

Consistency and Availability on Node Failures

2014-10-14 Thread Kyle Banker
Consider a 12-node Kafka cluster with a 200-partition topic having a replication factor of 3. Let's assume, in addition, that we're running Kafka v0.8.2, we've disabled unclean leader election, acks is -1, and min.isr is 2. Now suppose we lose 2 nodes. In this case, there's a good chance that 2/3 r
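The snippet is cut off, but the scenario it sets up can be roughly quantified. The sketch below assumes each partition's 3 replicas sit on a uniformly random set of brokers, independently of other partitions. That is a simplification for illustration, not Kafka's actual replica assignment. It estimates how many partitions would lose 2 of their 3 replicas and so fall below a min.isr of 2.

```python
from math import comb

BROKERS = 12
PARTITIONS = 200
REPLICATION_FACTOR = 3
FAILED = 2

# Probability that both failed brokers are among one partition's replicas:
# choose both failed brokers, then the remaining replica from the survivors.
p_two_lost = (
    comb(FAILED, 2)
    * comb(BROKERS - FAILED, REPLICATION_FACTOR - 2)
    / comb(BROKERS, REPLICATION_FACTOR)
)

# Expected number of partitions left with a single replica.
expected_affected = PARTITIONS * p_two_lost

# Chance that at least one partition is affected (independence assumed).
p_at_least_one = 1 - (1 - p_two_lost) ** PARTITIONS
```

Under these assumptions `p_two_lost` is 1/22, so about 9 of the 200 partitions would be expected to drop to a single replica, and it is almost certain that at least one does.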

Achieving Consistency and Durability

2014-10-14 Thread Kyle Banker
It's quite difficult to infer from the docs the exact techniques required to ensure consistency and durability in Kafka. I propose that we add a doc section detailing these techniques. I would be happy to help with this. The basic question is this: assuming that I can afford to temporarily halt pr

Re: Replay Strategies

2014-09-25 Thread Kyle Banker
certain offset. > > On Wed, Sep 24, 2014 at 12:09:40PM -0600, Kyle Banker wrote: > > What are the best ways to replay a portion of a topic in the past? > > > > I'm using Kafka to process a real-time data stream. Suppose I discover > that > > all data inside of a six

Replay Strategies

2014-09-24 Thread Kyle Banker
What are the best ways to replay a portion of a topic in the past? I'm using Kafka to process a real-time data stream. Suppose I discover that all data inside of a six-hour window one day ago was incorrectly processed. Since I'm checkpointing offsets in an external database, I can create a new co
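One way to picture replaying from externally checkpointed offsets: if the checkpoints for a partition are kept as `(timestamp, offset)` pairs, the offset range that brackets a time window can be found with a binary search. The snippet does not say how the checkpoints are stored, so the schema, the sample values, and the function name here are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def replay_range(checkpoints, window_start, window_end):
    """Return (start_offset, end_offset) bracketing the time window.

    checkpoints: list of (unix_timestamp, offset), sorted by timestamp.
    """
    times = [t for t, _ in checkpoints]
    # Last checkpoint at or before the window start (else the first one).
    lo = max(bisect_right(times, window_start) - 1, 0)
    # First checkpoint at or after the window end (else the last one).
    hi = min(bisect_left(times, window_end), len(checkpoints) - 1)
    return checkpoints[lo][1], checkpoints[hi][1]


# Hypothetical checkpoints for a single partition.
checkpoints = [(1000, 0), (2000, 500), (3000, 1200), (4000, 1900), (5000, 2500)]
start_offset, end_offset = replay_range(checkpoints, 2500, 3500)
```

The returned range is a superset of the window, so a replaying consumer would still need to filter by message time or tolerate reprocessing a few extra messages.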

Re: Producer errors (failed to send producer request, failed to send requests for topics)

2014-09-23 Thread Kyle Banker
> > size on the broker (default to 1MB)? > > > > Thanks, > > > > Jun > > > > On Mon, Sep 22, 2014 at 3:41 PM, Kyle Banker > wrote: > > > >> I have a test data set of 1500 messages (~2.5 MB each) that I'm using to > >> test Kaf

Producer errors (failed to send producer request, failed to send requests for topics)

2014-09-22 Thread Kyle Banker
I have a test data set of 1500 messages (~2.5 MB each) that I'm using to test Kafka throughput. I'm pushing this data using 40 Kafka producers, and I'm losing about 10% of the messages on each trial. I'm seeing errors of the following form: Failed to send producer request with correlation id 80 to

Which producer to use?

2014-06-23 Thread Kyle Banker
As of today, the latest Kafka docs show kafka.javaapi.producer.Producer in their example of the producer API ( https://kafka.apache.org/documentation.html#producerapi). Is there a reason why the latest producer client (org.apache.kafka.clients.producer.Producer) isn't mentioned? Is this client not

Re: Reliable Message Commits

2014-06-19 Thread Kyle Banker
ng if you found answer to your N-1 commit > > question? If auto commit happens only at iterator.next() and only for > the > > N-1 message then client code can be much simpler and reliable as you > > mentioned. I'm also looking forward to any post in this regard. >

Re: Reliable Message Commits

2014-06-18 Thread Kyle Banker
rtition is reassigned. On Fri, Jun 13, 2014 at 3:01 PM, Kyle Banker wrote: > I'm using Kafka 0.8.1.1. > > I have a simple goal: use the high-level consumer to consume a message > from Kafka, publish the message to a different system, and then commit the > message in Kafka. Based on

Reliable Message Commits

2014-06-13 Thread Kyle Banker
I'm using Kafka 0.8.1.1. I have a simple goal: use the high-level consumer to consume a message from Kafka, publish the message to a different system, and then commit the message in Kafka. Based on my reading of the docs and the mailing list, it seems like this isn't so easy to achieve. Here is my
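The consume, publish, then commit ordering described here can be sketched with in-memory stand-ins. This is not the Kafka high-level consumer API: the log is a plain list, the committed offset is an integer, and `publish` is a hypothetical callable that may fail. The point it illustrates is that committing only after a successful publish means a failed message is retried rather than skipped.

```python
def consume_publish_commit(messages, committed, publish):
    """Process messages after `committed`; return the new committed offset.

    messages: list of (offset, payload); publish: callable that may raise.
    """
    for offset, payload in messages:
        if offset <= committed:
            continue  # already committed on an earlier run
        try:
            publish(payload)
        except Exception:
            break  # stop here; this message is retried on the next run
        committed = offset  # commit only after a successful publish
    return committed


# Hypothetical downstream system that fails once on payload "b".
published = []
failures = {"remaining": 1}


def flaky_publish(payload):
    if payload == "b" and failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise RuntimeError("downstream unavailable")
    published.append(payload)


log = [(0, "a"), (1, "b"), (2, "c")]
first = consume_publish_commit(log, committed=-1, publish=flaky_publish)
second = consume_publish_commit(log, committed=first, publish=flaky_publish)
```

A crash between a successful publish and the commit would cause that message to be published again on restart, so this ordering gives at-least-once rather than exactly-once delivery.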