Re: How to prevent custom Partitioner from increasing the number of producer's requests?

2015-06-02 Thread Sebastien Falquier
Hi Jason, The default partitioner does not do the job, since my producers do not have smooth traffic. What I mean is that they can deliver lots of messages during 10 minutes and fewer during the next 10 minutes, that is to say the first partition will have stacked most of the messages of the last 2

Re: KafkaMetricsConfig not documented

2015-06-02 Thread Stevo Slavić
Created https://issues.apache.org/jira/browse/KAFKA-2244 On Mon, Jun 1, 2015 at 7:18 AM, Aditya Auradkar <aaurad...@linkedin.com.invalid> wrote: > Yeah, they aren't included in KafkaConfig for some reason but I think they > should. Can you file a jira? > > Aditya > >

Re: potential bug with offset request and just rolled log segment

2015-06-02 Thread Guozhang Wang
Alfred, As for 0.8.3, we are shooting for end of July: https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan Guozhang On Tue, Jun 2, 2015 at 8:43 AM, Alfred Landrum wrote: > I filed KAFKA-2236: > https://issues.apache.org/jira/browse/KAFKA-2236 > > Is there any guidance on whe

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Theo Hultberg
Henry: We run Kafka on the old and trusty m1.xlarge. We avoid EBS completely: it's network storage that pretends to be local, and when the network, which is AWS' weak spot, acts up, EBS is a big liability. It's also slow and expensive. Others: Thanks for sharing your experience with the d2's. We hav

Re: Using SimpleConsumer to get messages from offset until now

2015-06-02 Thread luo.fucong
I think the SimpleConsumer Example (https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example) in the wiki is a very good starting point. You can pass in the offset to the FetchRequest. And you

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Henry Cai
Steven, Do you have the AWS case # (or the Ubuntu bug/case #) when you hit that kernel panic issue? Our company will still be running on AMI image 12.04 for a while, I will see whether the fix was also ported onto Ubuntu 12.04 On Tue, Jun 2, 2015 at 2:53 PM, Steven Wu wrote: > now I remember w

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Steven Wu
Now I remember we had the same kernel panic issue in the first week of the D2 rollout. Then AWS fixed it and we haven't seen any issue since. Try Ubuntu 14.04 and see if it resolves your remaining kernel/instability issue. On Tue, Jun 2, 2015 at 2:30 PM, Wes Chow wrote: > > Daniel Nelson > June

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Wes Chow
Daniel Nelson June 2, 2015 at 4:39 PM On Jun 2, 2015, at 1:22 PM, Steven Wu wrote: can you elaborate what kind of instability you have encountered? We have seen the nodes become completely non-responsive. Usually they get rebooted automatically after 10-20 m

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Daniel Nelson
On Jun 2, 2015, at 1:22 PM, Steven Wu wrote: > > can you elaborate what kind of instability you have encountered? We have seen the nodes become completely non-responsive. Usually they get rebooted automatically after 10-20 minutes, but occasionally they get stuck for days in a state where they

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Steven Wu
Wes/Daniel, can you elaborate on what kind of instability you have encountered? We are on Ubuntu 14.04.2 and haven't encountered any issues so far. In the announcement, they did mention using Ubuntu 14.04 for better disk throughput. Not sure whether 14.04 also addresses any instability issue you enc

Consumer lag lies - orphaned offsets?

2015-06-02 Thread Otis Gospodnetic
Hi, I've noticed that when we restart our Kafka consumers our consumer lag metric sometimes looks "weird". Here's an example: https://apps.sematext.com/spm-reports/s/0Hq5zNb4hH You can see lag go up around 15:00, when some consumers were restarted. The "weird" thing is that the lag remains flat!

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Wes Chow
Our workaround is to switch to i2's. Amazon didn't mention anything, though we're getting on a call with them soon so I'll be sure to ask. Fwiw, we're also on 12.04. Wes Daniel Nelson June 2, 2015 at 2:42 PM Do you have any workarounds for the d2 issues?

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Daniel Nelson
> On Jun 2, 2015, at 10:39 AM, Wes Chow wrote: > > > We have run d2 instances with Kafka. They're currently unstable -- Amazon > confirmed a host issue with d2 instances that gets tickled by a Kafka > workload yesterday. Otherwise, it seems the d2 instance type is ideal as it > gets an enormo

Re: Kafka JMS metrics meaning

2015-06-02 Thread Marina
Thanks a lot to everybody for your suggestions! In addition to the Consumer lag (on the Consumers side though), under-replicated partitions, offline partitions, and active controller count, I am also thinking of monitoring the total size of partitions so that it does not exceed some MAX (like 10G, for example)

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Wes Chow
We have run d2 instances with Kafka. They're currently unstable -- Amazon confirmed a host issue with d2 instances that gets tickled by a Kafka workload yesterday. Otherwise, it seems the d2 instance type is ideal as it gets an enormous amount of disk throughput and you'll likely be network b

Re: Kafka JMS metrics meaning

2015-06-02 Thread Todd Palino
Under replicated is a must. Offline partitions is also good to monitor. We also use the active controller metric (it's 1 or 0) in aggregate for a cluster to know that the controller is running somewhere. For more general metrics, all topics bytes in and bytes out is good. We also watch the lea
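A rough sketch of the checks described above, for readers who want to turn them into an alert. It assumes the metrics have already been collected per broker into plain dictionaries; the key names (`under_replicated_partitions`, `offline_partitions`, `active_controller`) and the function name are made up for illustration and are not actual JMX bean names. The active controller value is summed across the cluster, since exactly one broker is expected to report 1.

```python
def cluster_problems(broker_metrics):
    """Return a list of human-readable problems found in per-broker metrics.

    broker_metrics maps a broker name to a dict with the hypothetical keys
    'under_replicated_partitions', 'offline_partitions' and
    'active_controller' (1 or 0).
    """
    problems = []
    for broker, metrics in sorted(broker_metrics.items()):
        if metrics["under_replicated_partitions"] > 0:
            problems.append(f"{broker}: under-replicated partitions")
        if metrics["offline_partitions"] > 0:
            problems.append(f"{broker}: offline partitions")

    # Aggregate across the cluster: exactly one broker should be controller.
    controllers = sum(m["active_controller"] for m in broker_metrics.values())
    if controllers != 1:
        problems.append(f"cluster: {controllers} active controllers, expected 1")
    return problems


healthy = {
    "broker1": {"under_replicated_partitions": 0, "offline_partitions": 0, "active_controller": 1},
    "broker2": {"under_replicated_partitions": 0, "offline_partitions": 0, "active_controller": 0},
}
print(cluster_problems(healthy))
```

An empty list means none of the three conditions fired; anything else is worth a look.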

Re: Offset management: client vs broker side responsibility

2015-06-02 Thread Otis Gospodnetic
Hi, I haven't followed the changes to offset tracking closely, other than that storing them in ZK is not the only option any more. I think what Stevo is asking about/suggesting is that there be a single API from which offset information can be retrieved (e.g. by monitoring tools), so that mo

Re: HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Steven Wu
EBS (network-attached storage) has gotten a lot better over the last few years, but we don't quite trust it for a Kafka workload. At Netflix, we were going with the new d2 instance type (HDD). Our perf/load testing shows it satisfies our workload. SSD is better in latency curve but pretty comparable in ter

RE: Kafka JMS metrics meaning

2015-06-02 Thread Aditya Auradkar
Number of under-replicated partitions and total request time are some good bets. Aditya From: Otis Gospodnetic [otis.gospodne...@gmail.com] Sent: Tuesday, June 02, 2015 9:56 AM To: users@kafka.apache.org; Marina Subject: Re: Kafka JMS metrics meaning Hi, On

Re: Kafka JMS metrics meaning

2015-06-02 Thread Otis Gospodnetic
Hi, On Tue, Jun 2, 2015 at 12:50 PM, Marina wrote: > Hi, > I have enabled JMX_PORT for Kafka server and am trying to understand some > of the metrics that are being exposed. I have two questions: > 1. what are the best metrics to monitor to quickly spot unhealthy Kafka > cluster? > People l

Kafka JMS metrics meaning

2015-06-02 Thread Marina
Hi, I have enabled JMX_PORT for the Kafka server and am trying to understand some of the metrics that are being exposed. I have two questions: 1. What are the best metrics to monitor to quickly spot an unhealthy Kafka cluster? 2. What do these metrics mean: ReplicaManager -> LeaderCount ? and ReplicaMa

HDD or SSD or EBS for kafka brokers in Amazon EC2

2015-06-02 Thread Henry Cai
We have been hosting Kafka brokers in Amazon EC2 and we are using EBS disks. But periodically we were hit by long I/O wait times on EBS in some Availability Zones. We are thinking of changing the instance types to ones with local HDD or local SSD. HDD is cheaper and bigger and seems quite a good fit for the Kafka u

Using SimpleConsumer to get messages from offset until now

2015-06-02 Thread Kevin Sjöberg
Hello, I'm trying to create a custom consumer that, given an offset, returns all messages until now. After this is done, the consumer is not needed anymore; hence, the consumer does not have to continue consuming messages that are being produced. The Kafka cluster consists of one broker and we only us
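One way to read "from an offset until now" is: record the end of the log when the consumer starts, then fetch in chunks until that recorded position is reached and stop. The sketch below only illustrates that stopping rule. It uses a plain Python list as a stand-in for a partition log, and `read_until_now` and `max_fetch` are hypothetical names, not part of the SimpleConsumer API.

```python
def read_until_now(log, start_offset, max_fetch=2):
    """Return the messages from start_offset up to the end of the log as it
    was when the call started.

    'log' is a list standing in for one partition; max_fetch mimics a
    bounded fetch size so that several fetches are needed.
    """
    if start_offset < 0:
        raise ValueError("start_offset must be non-negative")

    end_offset = len(log)  # snapshot of "now"; later appends are ignored
    messages = []
    offset = start_offset
    while offset < end_offset:
        upper = min(offset + max_fetch, end_offset)
        chunk = log[offset:upper]  # one bounded "fetch"
        messages.extend(chunk)
        offset += len(chunk)
    return messages


partition_log = ["m0", "m1", "m2", "m3", "m4"]
print(read_until_now(partition_log, 2))
```

With a real broker, the snapshot would come from an offset request for the latest offset and each chunk from a fetch request; those calls are deliberately left out here rather than guessed at.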

Re: potential bug with offset request and just rolled log segment

2015-06-02 Thread Alfred Landrum
I filed KAFKA-2236: https://issues.apache.org/jira/browse/KAFKA-2236 Is there any guidance on when 0.8.3 might be released?

Re: How to prevent custom Partitioner from increasing the number of producer's requests?

2015-06-02 Thread Jason Rosenberg
Hi Sebastien, You might just try using the default partitioner (which is random). It works by choosing a random partition each time it re-polls the metadata for the topic. By default, this happens every 10 minutes for each topic you produce to (so it evenly distributes load at a granularity of
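To make the behaviour described above concrete, here is a small model of it: pick a random partition, keep using it, and pick again once the refresh interval (10 minutes by default, per the message) has elapsed. This is only a sketch of the description, not the producer's actual code; the class name, the explicit `now_s` clock argument and the injectable `rng` are assumptions made so the example can be run and tested.

```python
import random


class StickyRandomPartitioner:
    """Model of 'choose a random partition on every metadata refresh'."""

    def __init__(self, num_partitions, refresh_interval_s=600.0, rng=None):
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions
        self.refresh_interval_s = refresh_interval_s
        self._rng = rng or random.Random()
        self._current = None
        self._chosen_at = None

    def partition(self, now_s):
        """Return the partition to use at time now_s (in seconds)."""
        expired = (
            self._current is None
            or now_s - self._chosen_at >= self.refresh_interval_s
        )
        if expired:
            # Re-pick only when the refresh interval has elapsed.
            self._current = self._rng.randrange(self.num_partitions)
            self._chosen_at = now_s
        return self._current


partitioner = StickyRandomPartitioner(num_partitions=4, rng=random.Random(0))
first = partitioner.partition(0.0)
print(first == partitioner.partition(300.0))
```

The model shows why bursty traffic can pile up on one partition: everything sent within one interval lands on the same partition, whatever the volume.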

RE: leader update partitions fail with KeeperErrorCode = BadVersion,kafka version=0.8.1.1

2015-06-02 Thread chenlax
I created a topic with 72 partitions and 2 replicas, then increased it to 108 partitions, and the cluster ran OK. Some days later I found that the topic has 2 partitions whose ISR includes only the leader. I checked the followers' log segments for those partitions, and the log segments are not being updated. I cannot find more useful logs from kafka l

How to prevent custom Partitioner from increasing the number of producer's requests?

2015-06-02 Thread Sebastien Falquier
Hi guys, I am new to Kafka and I am facing a problem I am not able to sort out. To smooth traffic over all my brokers' partitions, I have coded a custom Partitioner for my producers, using a simple round-robin algorithm that jumps from one partition to another on every batch of messages (correspondi
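For readers following the thread, the round-robin-per-batch idea can be sketched as below. The message is cut off before it says what a batch corresponds to, so the fixed `batch_size` here is an assumption, and the class is a standalone illustration rather than an implementation of Kafka's Partitioner interface.

```python
class BatchRoundRobinPartitioner:
    """Send batch_size consecutive messages to one partition, then move on
    to the next partition, wrapping around at the end."""

    def __init__(self, num_partitions, batch_size):
        if num_partitions <= 0 or batch_size <= 0:
            raise ValueError("num_partitions and batch_size must be positive")
        self.num_partitions = num_partitions
        self.batch_size = batch_size  # assumed: fixed number of messages per batch
        self._sent = 0

    def partition(self, key=None):
        """Return the partition for the next message; the key is ignored."""
        batch_index = self._sent // self.batch_size
        self._sent += 1
        return batch_index % self.num_partitions


rr = BatchRoundRobinPartitioner(num_partitions=3, batch_size=2)
print([rr.partition() for _ in range(8)])
```

Keeping whole batches on one partition spreads load evenly over time while avoiding a partition switch on every single message.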

Re: Kafka partitions unbalanced

2015-06-02 Thread Vijay Patil
I ran into a similar issue. I configured 3 disks, but partitions were allocated only to 2 disks (disk2 and disk3). Then I found that the left-out disk (disk1) was already hosting a large number of other partitions from different topics. So maybe partition allocation happens based on "how many partitions