Fwd: Spark streaming app that processes Kafka DStreams produces no output and no error

2017-01-20 Thread Shixiong(Ryan) Zhu
-- Forwarded message --
From: Shixiong(Ryan) Zhu 
Date: Fri, Jan 20, 2017 at 12:06 PM
Subject: Re: Spark streaming app that processes Kafka DStreams produces no
output and no error
To: shyla deshpande 


That's how KafkaConsumer works right now: it will retry forever on network
errors. See https://issues.apache.org/jira/browse/KAFKA-1894
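
For anyone debugging this, those retries are typically logged at DEBUG by the
Kafka client, so the driver log can look completely silent. A quick way to
surface them, assuming the stock log4j setup that Spark 2.0.2 ships with:

import org.apache.log4j.{Level, Logger}

// Lower the threshold for the Kafka client packages so the consumer's
// silent reconnect attempts show up in the driver log.
Logger.getLogger("org.apache.kafka.clients").setLevel(Level.DEBUG)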


Re: Spark streaming app that processes Kafka DStreams produces no output and no error

2017-01-19 Thread shyla deshpande
There was an issue connecting to Kafka; once that was fixed, the Spark app
works. Hope this helps someone.
Thanks
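
For anyone hitting the same symptom, a throwaway reachability probe run from a
cluster node can confirm or rule out broker connectivity; the host and port
below are placeholders for your own:

import java.net.{InetSocketAddress, Socket}

// If this times out or is refused from the worker, the consumer will just
// retry silently, which matches the no-output-no-error behavior above.
val socket = new Socket()
try {
  socket.connect(new InetSocketAddress("broker1.example.com", 9092), 5000)
  println("broker reachable")
} finally {
  socket.close()
}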


Re: Spark streaming app that processes Kafka DStreams produces no output and no error

2017-01-16 Thread shyla deshpande
Hello,
I checked the log file on the worker node and don't see any errors there.
This is the first time I have been asked to run on such a small cluster. I
feel it's a resource issue, but it would be a great help if somebody could
confirm this or share their experience. Thanks


Re: Spark streaming app that processes Kafka DStreams produces no output and no error

2017-01-14 Thread shyla deshpande
Hello,

I want to add that I don't even see the Streaming tab in the application UI
on port 4040 when I run it on the cluster.
The cluster on EC2 has 1 master node and 1 worker node.
The worker node is using 2 of 2 cores and 6 GB of its 6.3 GB of memory.

Can I run a Spark Streaming job with just 2 cores?

Appreciate your time and help.

Thanks
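
On the cores question: the kafka-0-10 integration uses the receiver-less
direct stream, so no core is permanently pinned to a receiver, and a 2-core
worker can run a simple job as long as each batch finishes within its
interval. A sketch of requesting resources explicitly on a standalone
cluster; the numbers are guesses for a 2-core / 6.3 GB worker, not tested
settings:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Cap the job at what the single worker offers, leaving memory headroom
// for the OS and the worker daemon.
val conf = new SparkConf()
  .setAppName("KafkaDStreamPrinter")
  .set("spark.cores.max", "2")
  .set("spark.executor.memory", "4g")
val ssc = new StreamingContext(conf, Seconds(5))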


Spark streaming app that processes Kafka DStreams produces no output and no error

2017-01-13 Thread shyla deshpande
Hello,

My Spark Streaming app that reads Kafka topics and prints the DStream works
fine on my laptop, but on the AWS cluster it produces no output and no errors.

Please help me debug.

I am using Spark 2.0.2 and the spark-streaming-kafka-0-10 integration.

Thanks

The following is the output of the Spark Streaming app...


17/01/14 06:22:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/01/14 06:22:43 WARN Checkpoint: Checkpoint directory check1 does not exist
Creating new context
17/01/14 06:22:45 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
17/01/14 06:22:45 WARN KafkaUtils: overriding enable.auto.commit to false for executor
17/01/14 06:22:45 WARN KafkaUtils: overriding auto.offset.reset to none for executor
17/01/14 06:22:45 WARN KafkaUtils: overriding executor group.id to spark-executor-whilDataStream
17/01/14 06:22:45 WARN KafkaUtils: overriding receive.buffer.bytes to 65536 see KAFKA-3135
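
For context, here is a minimal sketch of the kind of app described above,
reconstructed from the checkpoint and group.id hints in the log; the broker
address, topic name, and batch interval are placeholders, not the actual
values:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object KafkaDStreamPrinter {

  def createContext(): StreamingContext = {
    println("Creating new context")  // matches the line in the log above
    val conf = new SparkConf().setAppName("KafkaDStreamPrinter")
    val ssc = new StreamingContext(conf, Seconds(5))
    ssc.checkpoint("check1")  // the directory named in the Checkpoint warning

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1.example.com:9092",  // placeholder
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "whilDataStream",  // inferred from the executor group.id in the log
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("mytopic"), kafkaParams))
    stream.map(_.value).print()  // prints the first few records of each batch
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover from the checkpoint if it exists, otherwise build a fresh context.
    val ssc = StreamingContext.getOrCreate("check1", createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}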