Re: New Producer API - batched sync mode support

2015-04-28 Thread Jay Kreps
Hey guys, the locking argument is correct for very small records (< 50 bytes); batching will help here because for small records locking becomes the big bottleneck. I think these use cases are rare but not unreasonable. Overall I'd emphasize that the new producer is way faster at virtually all us

Re: New Producer API - batched sync mode support

2015-04-28 Thread Ivan Balashov
I must agree with @Roshan – it's hard to imagine anything more intuitive and easy to use for atomic batching than the old sync batch API. Also, it's fast. Coupled with a separate producer instance per broker:port:topic:partition it works very well. I would be glad if it finds its way into the new producer
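
A minimal sketch of the old sync batch pattern being praised here, assuming the 0.8.x Scala producer; broker address, topic, and payloads are placeholders:

```scala
import java.util.Properties
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

// Old (0.8.x) Scala producer in sync mode: a single send() call hands the
// whole batch over in one synchronous request.
val props = new Properties()
props.put("metadata.broker.list", "broker1:9092")   // placeholder
props.put("producer.type", "sync")
props.put("request.required.acks", "1")
props.put("serializer.class", "kafka.serializer.StringEncoder")

val producer = new Producer[String, String](new ProducerConfig(props))
val batch = (1 to 100).map(i => new KeyedMessage[String, String]("my-topic", s"msg-$i"))
producer.send(batch: _*)   // blocks until the batch is acknowledged
producer.close()
```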

Kafka 0.8.2 consumer offset checker throwing kafka.common.NotCoordinatorForConsumerException

2015-04-28 Thread Kartheek Karra
We recently upgraded kafka in our production environment (a cluster of 5 brokers) from 0.8.0 to 0.8.2. Since then the consumerOffsetChecker script is unable to fetch offsets due to kafka.common.NotCoordinatorForConsumerException. Note that I'm able to run the 'consumerOffsetChecker' from an older version 0.8

Re: New Producer API - batched sync mode support

2015-04-28 Thread Roshan Naik
@Ewen No I did not use compression in my measurements.

Re: SimpleConsumer not fetching messages

2015-04-28 Thread Ivan Balashov
Does increasing PartitionFetchInfo.fetchSize help? Speaking of the Kafka API, it looks like throwing an exception would be less confusing if fetchSize is not enough to get at least one message at the requested offset. 2015-04-28 21:12 GMT+03:00 Laran Evans : > I’ve got a simple consumer. According to GetOf
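
A minimal sketch of where fetchSize enters the 0.8.x SimpleConsumer API; host, topic, partition, and sizes are placeholders (the offset is the one reported in the thread):

```scala
import kafka.api.FetchRequestBuilder
import kafka.consumer.SimpleConsumer

// SimpleConsumer(host, port, soTimeout, bufferSize, clientId)
val consumer = new SimpleConsumer("broker1", 9092, 100000, 64 * 1024, "test-client")
val request = new FetchRequestBuilder()
  .clientId("test-client")
  .addFetch("my-topic", 0, 123146L, 1024 * 1024) // last arg is fetchSize in bytes;
                                                 // if it is smaller than the next
                                                 // message, the fetch silently
                                                 // returns no messages
  .build()
val response = consumer.fetch(request)
for (mo <- response.messageSet("my-topic", 0))
  println(s"offset=${mo.offset}")
consumer.close()
```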

Re: New Producer API - batched sync mode support

2015-04-28 Thread Roshan Naik
@Joel, If flush() works for this use case it may be an acceptable starting point (although not as clean as a native batched sync). I am not yet clear about some aspects of flush's batch semantics and its suitability for this mode of operation. Allow me to explore it with you folks… 1) flush() gu
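
A minimal sketch of the send-then-flush pattern under discussion, assuming a producer build that exposes flush() (added to the new producer in KAFKA-1865); broker address and topic are placeholders:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
props.put("bootstrap.servers", "broker1:9092")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)
// Fire off the whole batch asynchronously, keeping the futures...
val futures = (1 to 100).map { i =>
  producer.send(new ProducerRecord[String, String]("my-topic", s"msg-$i"))
}
producer.flush()   // ...then block until every in-flight send has completed
// Per-record success/failure still has to be checked on the futures:
val allSucceeded = futures.forall { f =>
  try { f.get(); true } catch { case _: Exception => false }
}
producer.close()
```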

Re: Unclaimed partitions

2015-04-28 Thread Dave Hamilton
1. We’re using version 0.8.1.1. 2. No failures in the consumer logs 3. We’re using the ConsumerOffsetChecker to see what partitions are assigned to the consumer group and what their offsets are. 8 of the 12 processes have each been assigned two partitions and they’re keeping up with the topic. The

Writing Spark RDDs into Kafka

2015-04-28 Thread Ming Zhao
Hi, I wonder if anyone has a good example of how to write Spark RDDs into Kafka. Specifically, my question is if there is an advantage of sending a list of messages each time over sending one message at a time. Sample code for sending one message at a time: dStream.foreachRDD(rdd => { rdd.col
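
One common shape for this, sketched below: open one producer per RDD partition inside foreachPartition and let the new producer batch internally, rather than paying producer setup per message (in practice the producer would be pooled or reused across batches). Broker address and topic are placeholders:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.streaming.dstream.DStream

def writeToKafka(dStream: DStream[String]): Unit = {
  dStream.foreachRDD { rdd =>
    rdd.foreachPartition { records =>
      // Created on the executor, so nothing non-serializable is captured.
      val props = new Properties()
      props.put("bootstrap.servers", "broker1:9092")
      props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      val producer = new KafkaProducer[String, String](props)
      records.foreach(r => producer.send(new ProducerRecord[String, String]("my-topic", r)))
      producer.close()   // flushes any outstanding batches
    }
  }
}
```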

Re: New producer: metadata update problem on 2 Node cluster.

2015-04-28 Thread Ewen Cheslack-Postava
Ok, all of that makes sense. The only way to possibly recover from that state is either for K2 to come back up allowing the metadata refresh to eventually succeed or to eventually try some other node in the cluster. Reusing the bootstrap nodes is one possibility. Another would be for the client to

SimpleConsumer not fetching messages

2015-04-28 Thread Laran Evans
I’ve got a simple consumer. According to GetOffsetShell my offset is 209418. But the SimpleConsumer doesn’t get any messages past offset 123146. It just won’t pull down any messages after that offset. If I send more messages onto the topic it still won’t pull them down. Though the offset does in

RE: Unclaimed partitions

2015-04-28 Thread Aditya Auradkar
Couple of questions: - What version of the consumer API are you using? - Are you seeing any rebalance failures in the consumer logs? - How do you determine that some partitions are unassigned? Just confirming that you have partitions that are not being consumed from as opposed to consumer threads

Re: hive output to kafka

2015-04-28 Thread Gwen Shapira
Kind of what you need but not quite: Sqoop2 is capable of getting data from HDFS to Kafka. AFAIK it doesn't support Hive queries, but feel free to open a JIRA for Sqoop :) Gwen On Tue, Apr 28, 2015 at 4:09 AM, Svante Karlsson wrote: > What's the best way of exporting contents (avro encoded) fr

Re: Topic missing Leader and Isr

2015-04-28 Thread Buntu Dev
Also note that the metadata for the topic is missing. I tried creating a few more topics and all have the same issue. Using the Kafka console producer on the topic, I see these error messages indicating the missing metadata: WARN Error while fetching metadata [{TopicMetadata for topic my-topic -> N

Re: Unclaimed partitions

2015-04-28 Thread Dave Hamilton
I’m sorry, I forgot to specify that these processes are in the same consumer group. Thanks, Dave On 4/28/15, 1:15 PM, "Aditya Auradkar" wrote: >Hi Dave, > >The simple consumer doesn't do any state management across consumer instances. >So I'm not sure how you are assigning partitions in y

RE: Unclaimed partitions

2015-04-28 Thread Aditya Auradkar
Hi Dave, The simple consumer doesn't do any state management across consumer instances. So I'm not sure how you are assigning partitions in your application code. Did you mean to say that you are using the high level consumer API? Thanks, Aditya From: D

Re: Kafka commit offset

2015-04-28 Thread Jiangjie Qin
Yes, if you set the offset storage to Kafka, the high level consumer will use Kafka for all offset-related operations. Jiangjie (Becket) Qin On 4/27/15, 7:03 PM, "Gomathivinayagam Muthuvinayagam" wrote: >I am trying to commit offset request in a background thread. I am able to >commit it so f

RE: Kafka - preventing message loss

2015-04-28 Thread Aditya Auradkar
You can use the min.insync.replicas topic level configuration in this case. It must be used with acks=-1 which is a producer config. http://kafka.apache.org/documentation.html#topic-config Aditya From: Gomathivinayagam Muthuvinayagam [sankarm...@gmail.co
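
The two settings together, in a minimal sketch (topic name, broker, and zookeeper addresses are placeholders):

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.KafkaProducer

// Topic side, set once, e.g.:
//   kafka-topics.sh --zookeeper zk1:2181 --alter --topic my-topic \
//     --config min.insync.replicas=2
// Producer side: with acks=-1, a send fails (NotEnoughReplicasException)
// instead of silently losing data when fewer than min.insync.replicas
// replicas are in sync.
val props = new Properties()
props.put("bootstrap.servers", "broker1:9092")
props.put("acks", "-1")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
val producer = new KafkaProducer[String, String](props)
```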

Re: New producer: metadata update problem on 2 Node cluster.

2015-04-28 Thread Manikumar Reddy
Hi Ewen, Thanks for the response. I agree with you; in some cases we should use the bootstrap servers. > > If you have logs at debug level, are you seeing this message in between the > connection attempts: > > Give up sending metadata request since no node is available > Yes, this log came for co

zookeeper restart fatal error

2015-04-28 Thread Emley, Andrew
Hi, I have had zk and kafka (2.8.0-0.8.1) set up and running nicely for a week or so. I decided to stop the zk and the kafka brokers and restart them; since stopping zk I can't start it again! It gives me a fatal exception that is related to one of my test topics, "multinode1partition4reptopic"!?

Re: Could you answer the following kafka stackoverflow question?

2015-04-28 Thread Gomathivinayagam Muthuvinayagam
If I use the high level consumer API, with offset storage set to kafka, does the high level consumer API take care of everything? Thanks & Regards, On Tue, Apr 28, 2015 at 7:47 AM, Manoj Khangaonkar wrote: > Use the Simple Consumer API if you want to control which offset in a > partition the client

Unclaimed partitions

2015-04-28 Thread Dave Hamilton
Hi, I am trying to consume a 24-partition topic across 12 processes. Each process is using the simple consumer API, and each is being assigned two consumer threads. I have noticed when starting these processes that sometimes some of my processes are not being assigned any partitions, and no reba

Re: Could you answer the following kafka stackoverflow question?

2015-04-28 Thread Manoj Khangaonkar
Use the Simple Consumer API if you want to control which offset in a partition the client should read messages from. Use the high level consumer if you don't care about offsets. Offsets are per partition and stored on zookeeper. regards On Tue, Apr 28, 2015 at 4:38 AM, Gomathivinayagam Muthuv

Re: Why fetching meta-data for topic is done three times?

2015-04-28 Thread Madhukar Bharti
Hi Zakee, Thanks for your reply. >message.send.max.retries 3 >retry.backoff.ms 100 >topic.metadata.refresh.interval.ms 600*1000 These are my properties. Regards, Madhukar On Tue, Apr 28, 2015 at 3:26 AM, Zakee wrote: > What values do you have for below properties? Or are these set to default
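
For reference, the same settings as they would be set on an old-producer config in code (values as quoted above):

```scala
import java.util.Properties

val props = new Properties()
props.put("message.send.max.retries", "3")   // up to 3 send retries, each of which
                                             // can trigger a metadata fetch
props.put("retry.backoff.ms", "100")
props.put("topic.metadata.refresh.interval.ms", (600 * 1000).toString)
```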

subscribe please

2015-04-28 Thread Manish Malhotra
subscribe

Kafka - preventing message loss

2015-04-28 Thread Gomathivinayagam Muthuvinayagam
I am trying to set up a cluster where messages are never lost once they are published. Say I have 3 brokers, I configure the replication factor to be 3, and I tolerate at most 1 failure; that achieves the above requirement. But when I post a message, how do I prevent kafka from accept

Could you answer the following kafka stackoverflow question?

2015-04-28 Thread Gomathivinayagam Muthuvinayagam
I have just posted the following question on stackoverflow. Could you answer it? I would like to use the Kafka high level consumer API, and at the same time I would like to disable auto commit of offsets. I tried to achieve this through the following steps. 1) auto.commit.enable
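
A minimal sketch of the setup being asked about — high level consumer, auto commit off, offsets stored in Kafka, commits issued explicitly (group id and zookeeper address are placeholders):

```scala
import java.util.Properties
import kafka.consumer.{Consumer, ConsumerConfig}

val props = new Properties()
props.put("zookeeper.connect", "zk1:2181")
props.put("group.id", "my-group")
props.put("auto.commit.enable", "false")   // step 1 from the question
props.put("offsets.storage", "kafka")      // per Becket's reply above
val connector = Consumer.create(new ConsumerConfig(props))
// ... consume from connector.createMessageStreams(Map("my-topic" -> 1)) ...
connector.commitOffsets()   // commit explicitly once processing succeeds
connector.shutdown()
```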

Re: hive output to kafka

2015-04-28 Thread Harut Martirosyan
I may be wrong, but try flume (http://flume.apache.org); I'm just not sure if it has a hive source. On 28 April 2015 at 15:09, Svante Karlsson wrote: > What's the best way of exporting contents (avro encoded) from hive queries > to kafka? > > Kind of camus, the other way around > > best regards

hive output to kafka

2015-04-28 Thread Svante Karlsson
What's the best way of exporting contents (avro encoded) from hive queries to kafka? Kind of like Camus, the other way around. Best regards, Svante