Hi all,

My Spark Streaming program occasionally (not always) hangs indefinitely
right after the first batch has been processed.
As you can see in the screenshots of the Spark Streaming monitoring UI
linked below, it hangs on the map stages that correspond (I assume) to the
second batch being processed.
If I kill these hanging map stages, they are relaunched but hang again in
the same way. However, if I kill the whole program and restart it, it
usually finishes without any problem.

The program I am running is very simple: it streams from an HDFS directory,
joins the data with a reference file (also stored on HDFS), performs a
simple calculation to update a state by key, and finally prints the
resulting DStream.
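
For illustration, a job of this shape might look roughly like the sketch
below. It is not the actual program: it is written in PySpark for brevity,
and the "key,value" line layout, the weighting step, the checkpoint
directory and the names parse_pair, update_total and build_pipeline are all
assumptions made up for the example.

```python
def parse_pair(line):
    """Split a 'key,value' line into a (key, value) tuple (assumed layout)."""
    key, value = line.split(",", 1)
    return key.strip(), value.strip()


def update_total(new_amounts, running_total):
    """State update: add this batch's amounts to the running total per key."""
    return (running_total or 0.0) + sum(new_amounts)


def build_pipeline(ssc, input_dir, reference_path, checkpoint_dir):
    """Wire up the job on an existing pyspark.streaming.StreamingContext."""
    # updateStateByKey requires a checkpoint directory.
    ssc.checkpoint(checkpoint_dir)

    # Static reference data, loaded once as (key, weight) pairs and cached.
    reference = (ssc.sparkContext.textFile(reference_path)
                 .map(parse_pair)
                 .mapValues(float)
                 .cache())

    # New files appearing in input_dir become the records of each batch.
    events = ssc.textFileStream(input_dir).map(parse_pair).mapValues(float)

    # Join every batch with the reference data: (key, (amount, weight)).
    joined = events.transform(lambda rdd: rdd.join(reference))

    # Hypothetical "simple calculation": weight each amount, then fold the
    # results into the per-key state.
    weighted = joined.mapValues(lambda pair: pair[0] * pair[1])
    totals = weighted.updateStateByKey(update_total)

    # Print the first elements of the resulting DStream for every batch.
    totals.pprint()
    return totals
```

The pipeline would be built on a StreamingContext created by the caller and
then started with ssc.start() followed by ssc.awaitTermination().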

I am clueless as to why this happens seemingly "randomly". Is it a bug, a
configuration issue, or something in my program? Any hints?

<http://apache-spark-user-list.1001560.n3.nabble.com/file/n16829/38.png> 

<http://apache-spark-user-list.1001560.n3.nabble.com/file/n16829/51.png> 





