Re: Multiple Kafka Receivers and Union

2014-09-25 Thread Matt Narrell
I suppose I have other problems as I can’t get the Scala example to work either. Puzzling, as I have literally coded like the examples (that are purported to work), but no luck. mn On Sep 24, 2014, at 11:27 AM, Tim Smith secs...@gmail.com wrote: Maybe differences between JavaPairDStream and

Re: Multiple Kafka Receivers and Union

2014-09-25 Thread Matt Narrell
Tim, I think I understand this now. I had a five node Spark cluster and a five partition topic, and I created five receivers. I found this: http://stackoverflow.com/questions/25785581/custom-receiver-stalls-worker-in-spark-streaming Indicating that if I use all my workers as receivers,

Re: Multiple Kafka Receivers and Union

2014-09-25 Thread Matt Narrell
Additionally, if I dial up/down the number of executor cores, this does what I want. Thanks for the extra eyes! mn On Sep 25, 2014, at 12:34 PM, Matt Narrell matt.narr...@gmail.com wrote: Tim, I think I understand this now. I had a five node Spark cluster and a five partition topic,

Re: Multiple Kafka Receivers and Union

2014-09-25 Thread Tim Smith
Good to know it worked out and thanks for the update. I didn't realize you need to provision for receiver workers + processing workers. One would think a worker would process multiple stages of an app/job and receive is just a stage of the job. On Thu, Sep 25, 2014 at 12:05 PM, Matt Narrell
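The provisioning point above comes down to simple arithmetic: each running receiver holds one core for as long as the application runs, so only the remaining cores can process batches. A minimal sketch of that bookkeeping follows. The object and method names are hypothetical, and the figure of five total cores is an assumption (one core per node on the five-node cluster); the thread does not state the actual core counts.

```scala
// Hypothetical helper, not part of Spark: it only illustrates the core budget.
object ReceiverBudget {
  // Cores left for batch processing once every receiver has taken one core.
  def processingCores(totalCores: Int, receivers: Int): Int =
    math.max(totalCores - receivers, 0)

  def main(args: Array[String]): Unit = {
    // Assumed: five cores in total, five receivers. Nothing is left to process data.
    println(processingCores(totalCores = 5, receivers = 5))
    // Assumed: executor cores raised to eight in total. Three cores can process batches.
    println(processingCores(totalCores = 8, receivers = 5))
  }
}
```

With zero cores left over, the receivers can still start, but no batch is ever processed, which fits the symptom of receiving no messages.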

Re: Multiple Kafka Receivers and Union

2014-09-24 Thread Matt Narrell
The part that works is the commented out, single receiver stream below the loop. It seems that when I call KafkaUtils.createStream more than once, I don’t receive any messages. I’ll dig through the logs, but at first glance yesterday I didn’t see anything suspect. I’ll have to look closer.

Re: Multiple Kafka Receivers and Union

2014-09-24 Thread Tim Smith
Maybe differences between JavaPairDStream and JavaPairReceiverInputDStream? On Wed, Sep 24, 2014 at 7:46 AM, Matt Narrell matt.narr...@gmail.com wrote: The part that works is the commented out, single receiver stream below the loop. It seems that when I call KafkaUtils.createStream more than

Multiple Kafka Receivers and Union

2014-09-23 Thread Matt Narrell
Hey, Spark 1.1.0, Kafka 0.8.1.1, Hadoop (YARN/HDFS) 2.5.1. I have a five partition Kafka topic. I can create a single Kafka receiver via KafkaUtils.createStream with five threads in the topic map and consume messages fine. Sifting through the user list and Google, I see that it's possible to

Re: Multiple Kafka Receivers and Union

2014-09-23 Thread Tim Smith
Posting your code would be really helpful in figuring out gotchas. On Tue, Sep 23, 2014 at 9:19 AM, Matt Narrell matt.narr...@gmail.com wrote: Hey, Spark 1.1.0 Kafka 0.8.1.1 Hadoop (YARN/HDFS) 2.5.1 I have a five partition Kafka topic. I can create a single Kafka receiver via

Re: Multiple Kafka Receivers and Union

2014-09-23 Thread Tim Smith
Sorry, I am almost Java illiterate but here's my Scala code to do the equivalent (that I have tested to work): val kInStreams = (1 to 10).map { _ => KafkaUtils.createStream(ssc, "zkhost.acme.net:2182", "myGrp", Map("myTopic" -> 1), StorageLevel.MEMORY_AND_DISK_SER) } //Create 10 receivers across the cluster,
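For readers following the thread, here is a fuller sketch of the multiple-receiver pattern the snippet above begins, written against the receiver-based Spark 1.1-era API (`KafkaUtils.createStream` and `StreamingContext.union`). The ZooKeeper address, group, and topic are the placeholders from the message. The application name, the two-second batch interval, the receiver count of five, and the final `count().print()` step are assumptions added for illustration, since the rest of the original code is cut off in the archive.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object MultiReceiverSketch {
  def main(args: Array[String]): Unit = {
    // Assumed app name and batch interval; neither is given in the thread.
    val conf = new SparkConf().setAppName("MultiReceiverSketch")
    val ssc = new StreamingContext(conf, Seconds(2))

    // Assumed: one receiver per partition of the five-partition topic.
    val numReceivers = 5

    // Each createStream call starts a separate receiver, and each receiver
    // holds one core for the lifetime of the application.
    val streams = (1 to numReceivers).map { _ =>
      KafkaUtils.createStream(
        ssc,
        "zkhost.acme.net:2182",   // ZooKeeper quorum (placeholder from the thread)
        "myGrp",                  // consumer group (placeholder from the thread)
        Map("myTopic" -> 1),      // topic -> consumer threads in this receiver
        StorageLevel.MEMORY_AND_DISK_SER)
    }

    // Merge the per-receiver streams into one DStream of (key, message) pairs.
    val unified = ssc.union(streams)

    // Illustrative output step: print the number of messages in each batch.
    unified.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The application needs more cores than `numReceivers`, otherwise every core is taken by a receiver and no batch gets processed.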

Re: Multiple Kafka Receivers and Union

2014-09-23 Thread Matt Narrell
To my eyes, these are functionally equivalent. I’ll try a Scala approach, but this may cause waves for me upstream (e.g., non-Java). Thanks for looking at this. If anyone else can see a glaring issue in the Java approach, that would be appreciated. Thanks, Matt On Sep 23, 2014, at 4:13 PM,