Re: Spark streaming job hangs

2015-12-01 Thread Archit Thakur
Which version of Spark are you running? Have you created a Kafka direct
stream? I am asking because you may or may not be using receivers.
Also, when you say it hangs, do you mean there are no further log lines
after this and the process is still up?
Or do you mean it kept adding jobs but did nothing else? (I am
optimistic :) ).
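
One way to tell those two cases apart is to look at what the tail of the
driver log contains. Below is a minimal sketch in plain Python (not a Spark
tool), assuming log lines in the format shown in the original post. The
function name classify_tail and the labels it returns are made up for
illustration; the sample lines are copied from that post, unwrapped onto
single lines.

```python
import re

# "Added jobs for time <batch time> ms" lines, logged on the JobGenerator
# thread in the post's output.
ADDED_JOBS = re.compile(r"\[JobGenerator\].*Added jobs for time (\d+) ms")


def classify_tail(lines, tail=3):
    """Classify the last `tail` non-empty log lines (hypothetical helper).

    Returns "no-output" if there are no lines at all, "only-adding-jobs" if
    every line in the tail is an "Added jobs" message, and "other-activity"
    otherwise.
    """
    recent = [line for line in lines if line.strip()][-tail:]
    if not recent:
        return "no-output"
    if all(ADDED_JOBS.search(line) for line in recent):
        return "only-adding-jobs"
    return "other-activity"


# Sample lines taken from the original post.
sample = [
    "2015-12-01 06:04:30,697 [dag-scheduler-event-loop] INFO (Logging.scala:59) - Adding task set 19.0 with 4 tasks",
    "2015-12-01 06:04:30,872 [pool-13-thread-1] INFO  (Logging.scala:59) - Disconnected from Cassandra cluster: APG DEV Cluster",
    "2015-12-01 06:04:35,060 [JobGenerator] INFO  (Logging.scala:59) - Added jobs for time 1448949875000 ms",
    "2015-12-01 06:04:45,034 [JobGenerator] INFO  (Logging.scala:59) - Added jobs for time 1448949885000 ms",
    "2015-12-01 06:04:55,064 [JobGenerator] INFO  (Logging.scala:59) - Added jobs for time 1448949895000 ms",
]

if __name__ == "__main__":
    print(classify_tail(sample))
```

A tail that contains only "Added jobs" lines would suggest that batches are
still being queued while earlier ones are not finishing, which is the second
case above.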

On Tue, Dec 1, 2015 at 4:12 PM, Paul Leclercq 
wrote:

> You might not have enough cores to process data from Kafka
>
>
>> When running a Spark Streaming program locally, do not use “local” or
>> “local[1]” as the master URL. Either of these means that only one thread
>> will be used for running tasks locally. If you are using an input DStream
>> based on a receiver (e.g. sockets, Kafka, Flume, etc.), then the single
>> thread will be used to run the receiver, leaving no thread for processing
>> the received data. *Hence, when running locally, always use “local[n]”
>> as the master URL, where n > number of receivers to run* (see Spark
>> Properties for information on how to set the master).
>
>
>
>  
> https://spark.apache.org/docs/latest/streaming-programming-guide.html#input-dstreams-and-receivers
> 
>
> 2015-12-01 7:13 GMT+01:00 Cassa L :
>
>> Hi,
>>  I am reading data from Kafka into Spark. It runs fine for some time but
>> then hangs forever with the following output. I don't see any errors in
>> the logs.
>> How do I debug this?
>>
>> 2015-12-01 06:04:30,697 [dag-scheduler-event-loop] INFO
>> (Logging.scala:59) - Adding task set 19.0 with 4 tasks
>> 2015-12-01 06:04:30,872 [pool-13-thread-1] INFO  (Logging.scala:59) -
>> Disconnected from Cassandra cluster: APG DEV Cluster
>> 2015-12-01 06:04:35,060 [JobGenerator] INFO  (Logging.scala:59) - Added
>> jobs for time 1448949875000 ms
>> 2015-12-01 06:04:40,054 [JobGenerator] INFO  (Logging.scala:59) - Added
>> jobs for time 1448949880000 ms
>> 2015-12-01 06:04:45,034 [JobGenerator] INFO  (Logging.scala:59) - Added
>> jobs for time 1448949885000 ms
>> 2015-12-01 06:04:50,100 [JobGenerator] INFO  (Logging.scala:59) - Added
>> jobs for time 1448949890000 ms
>> 2015-12-01 06:04:55,064 [JobGenerator] INFO  (Logging.scala:59) - Added
>> jobs for time 1448949895000 ms
>> 2015-12-01 06:05:00,125 [JobGenerator] INFO  (Logging.scala:59) - Added
>> jobs for time 1448949900000 ms
>>
>>
>> Thanks
>> LCassa
>>
>
>
>
> --
>
> Paul Leclercq | Data engineer
>
>
>  paul.lecle...@tabmo.io  |  http://www.tabmo.fr/
>


Re: Spark streaming job hangs

2015-12-01 Thread Paul Leclercq
You might not have enough cores to process data from Kafka


> When running a Spark Streaming program locally, do not use “local” or
> “local[1]” as the master URL. Either of these means that only one thread
> will be used for running tasks locally. If you are using an input DStream
> based on a receiver (e.g. sockets, Kafka, Flume, etc.), then the single
> thread will be used to run the receiver, leaving no thread for processing
> the received data. *Hence, when running locally, always use “local[n]” as
> the master URL, where n > number of receivers to run* (see Spark
> Properties for information on how to set the master).


 
https://spark.apache.org/docs/latest/streaming-programming-guide.html#input-dstreams-and-receivers
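
The rule quoted above can be written as a small check. This is a minimal
sketch in plain Python, not part of Spark: the function names local_threads
and leaves_processing_thread are hypothetical, and it only understands the
local, local[n], local[n,f] and local[*] master URL forms.

```python
import os
import re

# Matches the local master URL forms: "local", "local[4]", "local[4,2]", "local[*]".
_LOCAL_MASTER = re.compile(r"^local(?:\[(\d+|\*)(?:\s*,\s*\d+)?\])?$")


def local_threads(master):
    """Return the number of local task threads implied by `master`,
    or None if it is not a local master URL (hypothetical helper)."""
    match = _LOCAL_MASTER.match(master.strip())
    if match is None:
        return None
    count = match.group(1)
    if count is None:      # plain "local" means a single thread
        return 1
    if count == "*":       # "local[*]" means one thread per logical core
        return os.cpu_count() or 1
    return int(count)


def leaves_processing_thread(master, num_receivers):
    """True if at least one thread remains after each receiver takes one."""
    threads = local_threads(master)
    if threads is None:
        raise ValueError("not a local master URL: %r" % master)
    return threads > num_receivers


if __name__ == "__main__":
    # With one receiver, "local" and "local[1]" leave nothing for processing.
    for master in ("local", "local[1]", "local[2]", "local[4]"):
        print(master, leaves_processing_thread(master, num_receivers=1))
```

With a receiver-less input such as the Kafka direct stream, num_receivers
would be 0 in this sketch, so the check only matters for receiver-based
streams.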


2015-12-01 7:13 GMT+01:00 Cassa L :

> Hi,
>  I am reading data from Kafka into Spark. It runs fine for some time but
> then hangs forever with the following output. I don't see any errors in
> the logs.
> How do I debug this?
>
> 2015-12-01 06:04:30,697 [dag-scheduler-event-loop] INFO
> (Logging.scala:59) - Adding task set 19.0 with 4 tasks
> 2015-12-01 06:04:30,872 [pool-13-thread-1] INFO  (Logging.scala:59) -
> Disconnected from Cassandra cluster: APG DEV Cluster
> 2015-12-01 06:04:35,060 [JobGenerator] INFO  (Logging.scala:59) - Added
> jobs for time 1448949875000 ms
> 2015-12-01 06:04:40,054 [JobGenerator] INFO  (Logging.scala:59) - Added
> jobs for time 1448949880000 ms
> 2015-12-01 06:04:45,034 [JobGenerator] INFO  (Logging.scala:59) - Added
> jobs for time 1448949885000 ms
> 2015-12-01 06:04:50,100 [JobGenerator] INFO  (Logging.scala:59) - Added
> jobs for time 1448949890000 ms
> 2015-12-01 06:04:55,064 [JobGenerator] INFO  (Logging.scala:59) - Added
> jobs for time 1448949895000 ms
> 2015-12-01 06:05:00,125 [JobGenerator] INFO  (Logging.scala:59) - Added
> jobs for time 1448949900000 ms
>
>
> Thanks
> LCassa
>



-- 

Paul Leclercq | Data engineer


 paul.lecle...@tabmo.io  |  http://www.tabmo.fr/