article: hands-on kafka: dynamic DNS

2015-04-24 Thread Pierre-Yves Ritschard
Hi list! I just wanted to mention a small article I put together. It describes an approach to leveraging log compaction when you have compound types and messages are operations on those types, with an example use case: http://spootnik.org/entries/2015/04/23_hands-on-kafka-dynamic-dns.html

Atomic write of message batch to single partition

2015-04-24 Thread Martin Krasser
Hello, I'm using Kafka 0.8.2.1 (in a Scala/Java project) and trying to find out how to atomically write n messages (message batch) to a single topic partition. Is there any client API that gives such a guarantee? I couldn't find a clear answer reading the documentation, API docs (of the old

[ANN] Apache Cloudstack 4.5 kafka-event-bus plugin

2015-04-24 Thread Pierre-Yves Ritschard
Hi list, I thought I'd also mention that the next release of Apache Cloudstack adds the ability to publish all events happening throughout the environment to Kafka. Events are published as JSON. http://cloudstack-administration.readthedocs.org/en/latest/events.html#kafka-configuration

Re: Why fetching meta-data for topic is done three times?

2015-04-24 Thread Madhukar Bharti
Hi all, Having gone through the code, I found that when the producer starts it does three things: 1. Sends a metadata request. 2. Sends the message to the broker (fetching the broker list). 3. If the number of messages to be produced is greater than 0, it tries again to refresh metadata for the outstanding produce requests. Each of

Does log.retention.bytes apply only to partition leader or also replicas

2015-04-24 Thread David Corley
Does the byte retention policy apply to replica partitions or leader partitions or both? In a multi-node cluster, with all brokers configured with different retention policies, it seems obvious that the partitions for which a given broker is a leader will be subject to the byte

Re: [KIP-DISCUSSION] KIP-22 Expose a Partitioner interface in the new producer

2015-04-24 Thread Gianmarco De Francisci Morales
Hi, Here are the questions I think we should consider: 1. Do we need this at all given that we have the partition argument in ProducerRecord which gives full control? I think we do need it because this is a way to plug in a different partitioning strategy at run time and do it in a fairly

Re: Kafka server - conflicted ephemeral node

2015-04-24 Thread 小宇
Is there a recommended way to handle this issue? Thanks! On Wednesday, April 22, 2015, Mayuresh Gharat gharatmayures...@gmail.com wrote: This happens due to a bug in ZooKeeper; sometimes the znode does not get deleted automatically. We have seen it many times at LinkedIn and are trying to investigate further.

New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Manikumar Reddy
We have a 2 node cluster with 100 topics. Should we use a single producer for all topics or create multiple producers? What is the best choice w.r.t. network load/failures, node failures, latency, locks? Regards, Manikumar

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Navneet Gupta (Tech - BLR)
Hi, I ran some tests on our cluster by sending messages from multiple clients (machines). Each machine had about 40-100 threads per producer. I thought of trying out multiple producers per client, with each producer receiving messages from, say, 10-15 threads. I actually did see an increase

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Roshan Naik
Jay, It's not evident how to switch between sync and async modes using this new 'org.apache.kafka.clients.tools.ProducerPerformance'. AFAICT it measures in async mode by default. -roshan On 4/24/15 3:23 PM, Jay Kreps jay.kr...@gmail.com wrote: That should work. I recommend using the

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Jay Kreps
That should work. I recommend using the performance tool cited in the blog linked from the performance page of the website. That tool is more accurate and uses the new producer. On Fri, Apr 24, 2015 at 2:29 PM, Roshan Naik ros...@hortonworks.com wrote: Can we use the new 0.8.2 producer perf

New and old producers partition messages differently

2015-04-24 Thread James Cheng
Hi, I was playing with the new producer in 0.8.2.1 using partition keys (semantic partitioning I believe is the phrase?). I noticed that the default partitioner in 0.8.2.1 does not partition items the same way as the old 0.8.1.1 default partitioner was doing. For a test item, the old producer
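To make the observation above concrete: a keyed producer typically picks a partition as hash(key) modulo the partition count, so two clients that hash the key differently can send the same key to different partitions. The preview does not say which hash functions the two producers use, so the sketch below is only an illustration under stated assumptions: it uses the Java `String.hashCode` formula and CRC32 as stand-ins, and the function names are hypothetical, not Kafka APIs.

```python
import zlib


def java_string_hash(s):
    """Java's String.hashCode formula, wrapped to a signed 32-bit int.

    Matches Java for keys made of BMP characters (e.g. plain ASCII).
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def partition_by_string_hash(key, num_partitions):
    # Stand-in for "hash the key object": mask the sign bit, then take the modulo.
    return (java_string_hash(key) & 0x7FFFFFFF) % num_partitions


def partition_by_crc32(key, num_partitions):
    # Stand-in for "hash the serialized key bytes" (CRC32 is an assumption here).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    # Print both mappings side by side for a few sample keys.
    for key in ["user-1", "user-2", "user-3", "user-4"]:
        print(key, partition_by_string_hash(key, 8), partition_by_crc32(key, 8))
```

Both functions are deterministic, so each client is self-consistent; the mismatch only matters when consumers rely on a given key always landing in the same partition regardless of which producer wrote it.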

New producer: metadata update problem on 2 Node cluster.

2015-04-24 Thread Manikumar Reddy
We are testing the new producer on a 2 node cluster. Under some node failure scenarios, the producer is not able to update metadata. Steps to reproduce: 1. Form a 2 node cluster (K1, K2). 2. Create a topic with a single partition, replication factor = 2. 3. Start producing data (producer metadata: K1, K2). 2.

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Jay Kreps
If you are talking about within a single process, having one producer is generally the fastest because batching dramatically reduces the number of requests (esp using the new java producer). -Jay On Fri, Apr 24, 2015 at 4:54 AM, Manikumar Reddy manikumar.re...@gmail.com wrote: We have a 2 node

Consumer members do not own any partitions in consumer group

2015-04-24 Thread Bryan Baugher
Hi everyone, We are running Kafka 0.8.1.1 with Storm. We wrote our own spout which uses the high level consumer API. Our setup is to create 4 spouts per worker. If you're not familiar with Storm, it's basically 4 Kafka consumers per Java process. This particular consumer group is interested in 20

Re: kafka user group in los angeles

2015-04-24 Thread Jon Bringhurst
Hey Alex, It looks like this group might be appropriate to have a Kafka talk at: http://www.meetup.com/Los-Angeles-Big-Data-Users-Group/ It might be worth showing up at one of their events and asking around. -Jon On Thu, Apr 23, 2015 at 11:40 AM, Alex Toth a...@purificator.net wrote: Hi,

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Manikumar Reddy
Hi Jay, Yes, we are producing from a single process/JVM. From the docs: "The producer will attempt to batch records together into fewer requests whenever multiple records are being sent to the same partition." If I understand correctly, batching happens at the topic/partition level, not at the node level.

Re: kafka user group in los angeles

2015-04-24 Thread Alex Toth
Thanks. I'll see what I can find. alex From: Jon Bringhurst j...@bringhurst.org To: users@kafka.apache.org; Alex Toth a...@purificator.net Sent: Friday, April 24, 2015 9:51 AM Subject: Re: kafka user group in los angeles Hey Alex, It looks like this group might be appropriate

RE: kafka user group in los angeles

2015-04-24 Thread Jeff Field
If you don't mind venturing further south, http://www.meetup.com/OCBigData/ could be a good meetup to discuss Kafka at as well. -Original Message- From: Alex Toth [mailto:a...@purificator.net] Sent: Friday, April 24, 2015 9:55 AM To: Jon Bringhurst; users@kafka.apache.org Subject: Re:

Getting java.lang.IllegalMonitorStateException in mirror maker when building fetch request

2015-04-24 Thread tao xiao
Hi team, I observed java.lang.IllegalMonitorStateException thrown from AbstractFetcherThread in mirror maker when it is trying to build the fetch request. Below is the error: [2015-04-23 16:16:02,049] ERROR [ConsumerFetcherThread-group_id_localhost-1429830778627-4519368f-0-7], Error due to

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Roshan Naik
Yes, I too notice the same behavior (with producer/consumer perf tool on 8.1.2): adding more threads indeed improved the perf a lot (both with and without --sync). In --sync mode batch size made almost no diff; larger events improved the perf. I was doing some 8.1.2 perf testing with a 1 node

Re: Consumer members do not own any partitions in consumer group

2015-04-24 Thread Bryan Baugher
Managed to figure this one out myself. This is due to the range partition assignment in 0.8.1.1 and the fact each of our topics has 8 partitions so only the first 8 consumers get assigned anything. Looks like 0.8.2.0 has a round robin assignment which is what we want. On Fri, Apr 24, 2015 at
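The effect described above can be reproduced with a small simulation. This is a simplified sketch, not Kafka's actual assignor code: the consumer count (12), the topic names and the function names are hypothetical, and the round-robin variant deals partitions in plain sorted order, which may differ from the ordering the real implementation uses.

```python
def range_assign(topics, consumers):
    """Per-topic range assignment.

    topics: dict of topic name -> partition count
    consumers: list of consumer ids
    Each topic is split independently into contiguous ranges over the
    sorted consumer list.
    """
    owned = {c: [] for c in consumers}
    ordered = sorted(consumers)
    for topic, n_parts in sorted(topics.items()):
        per_consumer, extra = divmod(n_parts, len(ordered))
        for i, c in enumerate(ordered):
            # The first `extra` consumers each take one additional partition.
            start = per_consumer * i + min(i, extra)
            count = per_consumer + (1 if i < extra else 0)
            owned[c].extend((topic, p) for p in range(start, start + count))
    return owned


def round_robin_assign(topics, consumers):
    """Deal every (topic, partition) pair across all consumers in turn."""
    owned = {c: [] for c in consumers}
    ordered = sorted(consumers)
    all_parts = [(t, p) for t, n in sorted(topics.items()) for p in range(n)]
    for i, tp in enumerate(all_parts):
        owned[ordered[i % len(ordered)]].append(tp)
    return owned


def idle(owned):
    """Consumers that ended up with no partitions."""
    return sorted(c for c, parts in owned.items() if not parts)


if __name__ == "__main__":
    topics = {"topic-a": 8, "topic-b": 8}            # hypothetical topics
    consumers = ["consumer-%02d" % i for i in range(12)]
    print("range idle:      ", idle(range_assign(topics, consumers)))
    print("round-robin idle:", idle(round_robin_assign(topics, consumers)))
```

With range assignment every 8-partition topic is handed to the same first 8 consumers, so adding topics never gives work to the rest of the group. Dealing all partitions of all topics in one pass spreads them across every consumer as long as the total partition count is at least the group size.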

Kafka dependencies on Pig and Avro

2015-04-24 Thread Carita Ou
Hi, I'm new to Kafka and noticed that Kafka has dependencies on older versions of Avro (1.4.0) and Pig (0.8.0); is there a reason for not moving to the latest (Avro 1.7.7 and Pig 0.14.0)? Also, kafka-hadoop-producer has dependencies on different versions of Pig, pig-0.8.0 and piggybank-0.12.0;

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Jay Kreps
Do make sure if you are at all performance sensitive you are using the new producer api we released in 0.8.2. -Jay On Fri, Apr 24, 2015 at 12:46 PM, Roshan Naik ros...@hortonworks.com wrote: Yes, I too notice the same behavior (with producer/consumer perf tool on 8.1.2): adding more threads

Re: New Java Producer: Single Producer vs multiple Producers

2015-04-24 Thread Roshan Naik
Can we use the new 0.8.2 producer perf tool against a 0.8.1 broker? -roshan On 4/24/15 1:19 PM, Jay Kreps jay.kr...@gmail.com wrote: Do make sure if you are at all performance sensitive you are using the new producer api we released in 0.8.2. -Jay On Fri, Apr 24, 2015 at 12:46 PM, Roshan

leader election rate

2015-04-24 Thread Wesley Chow
Looking at the output from the jmx stats from our Kafka cluster, I see a more or less constant leader election rate of around 2.5 from our controller. Is this expected, or does this mean that leaders are shifting around constantly? If they are shifting, how should I go about debugging, and what