Fwd: Spark streaming app that processes Kafka DStreams produces no output and no error
-- Forwarded message --
From: Shixiong(Ryan) Zhu
Date: Fri, Jan 20, 2017 at 12:06 PM
Subject: Re: Spark streaming app that processes Kafka DStreams produces no output and no error
To: shyla deshpande

That's how KafkaConsumer works right now. It will retry forever for network errors. See https://issues.apache.org/jira/browse/KAFKA-1894
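Because the consumer retries silently, it can help to bound how long the executor-side consumers will poll before failing loudly. A minimal sketch, assuming the spark-streaming-kafka-0-10 integration (the property is read by its executor-side consumer and defaults to spark.network.timeout; the value here is only illustrative):

    import org.apache.spark.SparkConf

    // Bound how long each executor-side KafkaConsumer.poll() may block, so a
    // task fails with an explicit "Failed to get records ... after polling"
    // error instead of appearing to hang.
    val conf = new SparkConf()
      .setAppName("KafkaDStreamPrinter") // hypothetical app name
      .set("spark.streaming.kafka.consumer.poll.ms", "10000") // 10s, illustrative

Note this only bounds the executor-side poll; the driver-side consumer that fetches offsets can still block on an unreachable broker, which is the behavior KAFKA-1894 describes.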
Re: Spark streaming app that processes Kafka DStreams produces no output and no error
There was an issue connecting to Kafka; once that was fixed, the Spark app works. Hope this helps someone.

Thanks
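For anyone debugging a similar silent hang, a quick connectivity smoke test against the brokers can save time. A minimal sketch, assuming the 0.10.x kafka-clients jar on the classpath; the broker address is a placeholder:

    import java.util.Properties
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import org.apache.kafka.common.serialization.StringDeserializer

    object KafkaSmokeTest {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker-host:9092") // placeholder address
        props.put("key.deserializer", classOf[StringDeserializer].getName)
        props.put("value.deserializer", classOf[StringDeserializer].getName)

        val consumer = new KafkaConsumer[String, String](props)
        try {
          // listTopics() fetches cluster metadata; it either returns or throws
          // a TimeoutException after request.timeout.ms, rather than hanging.
          val topics = consumer.listTopics()
          println(s"Brokers reachable. Topics: ${topics.keySet()}")
        } finally {
          consumer.close()
        }
      }
    }

If this succeeds from the driver machine but the stream still hangs, it is worth checking that the brokers' advertised listeners are also reachable from the worker nodes, a common pitfall on EC2.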
Re: Spark streaming app that processes Kafka DStreams produces no output and no error
Hello,

I checked the log file on the worker node and don't see any error there. This is the first time I have been asked to run on such a small cluster. I feel it's a resources issue, but it would be a great help if somebody could confirm this or share your experience. Thanks
Re: Spark streaming app that processes Kafka DStreams produces no output and no error
Hello,

I want to add that I don't even see the Streaming tab in the application UI on port 4040 when I run it on the cluster.

The cluster on EC2 has 1 master node and 1 worker node. The worker node is using 2 of 2 cores and 6 GB of 6.3 GB of memory.

Can I run a Spark Streaming job with just 2 cores?

Appreciate your time and help.

Thanks
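With the kafka-0-10 direct stream there is no long-running receiver, so a 2-core worker can in principle run a simple print() job; the receiver-based API, by contrast, needs at least one core per receiver plus one for processing. A sizing sketch for a cluster this small (the master URL, app name, and values are placeholders, not taken from the thread):

    import org.apache.spark.SparkConf

    // Sizing for a single 2-core / 6.3 GB worker. The direct (receiver-less)
    // Kafka stream leaves both cores free for batch processing.
    val conf = new SparkConf()
      .setAppName("KafkaDStreamPrinter")
      .setMaster("spark://master-host:7077") // placeholder standalone master URL
      .set("spark.executor.cores", "2")      // use both cores on the one worker
      .set("spark.executor.memory", "4g")    // leave headroom below the 6.3 GB cap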
Spark streaming app that processes Kafka DStreams produces no output and no error
Hello,

My Spark Streaming app that reads Kafka topics and prints the DStream works fine on my laptop, but on the AWS cluster it produces no output and no errors.

Please help me debug.

I am using Spark 2.0.2 and kafka-0-10.

Thanks

The following is the output of the spark streaming app...

17/01/14 06:22:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/01/14 06:22:43 WARN Checkpoint: Checkpoint directory check1 does not exist
Creating new context
17/01/14 06:22:45 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
17/01/14 06:22:45 WARN KafkaUtils: overriding enable.auto.commit to false for executor
17/01/14 06:22:45 WARN KafkaUtils: overriding auto.offset.reset to none for executor
17/01/14 06:22:45 WARN KafkaUtils: overriding executor group.id to spark-executor-whilDataStream
17/01/14 06:22:45 WARN KafkaUtils: overriding receive.buffer.bytes to 65536 see KAFKA-3135
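For reference, the KafkaUtils warnings above are the normal startup output of the 0-10 direct stream, and the Checkpoint warning matches StreamingContext.getOrCreate creating a fresh context. A minimal sketch of the kind of app described, using the standard spark-streaming-kafka-0-10 API; the broker address and topic name are placeholders, while the group.id and checkpoint directory are taken from the log:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object KafkaDStreamPrinter {
      def main(args: Array[String]): Unit = {
        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker-host:9092", // placeholder
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "whilDataStream",            // from the log's executor group.id
          "auto.offset.reset" -> "latest",
          "enable.auto.commit" -> (false: java.lang.Boolean)
        )

        def createContext(): StreamingContext = {
          val conf = new SparkConf().setAppName("KafkaDStreamPrinter")
          val ssc = new StreamingContext(conf, Seconds(5)) // illustrative batch interval
          ssc.checkpoint("check1")                         // directory from the log
          val stream = KafkaUtils.createDirectStream[String, String](
            ssc, PreferConsistent,
            Subscribe[String, String](Seq("mytopic"), kafkaParams)) // placeholder topic
          stream.map(_.value).print()
          ssc
        }

        // Produces "Checkpoint directory check1 does not exist ... Creating new context"
        // on first run; later runs recover the context from the checkpoint.
        val ssc = StreamingContext.getOrCreate("check1", createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }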