Hi all, my Spark Streaming program occasionally (not always) hangs indefinitely right after the first batch has been processed. As the screenshots of the Spark Streaming monitoring UI linked below show, it hangs on the map stages that (I assume) correspond to the second batch. If I kill these hanging map stages, they are relaunched but hang in the same way. However, if I kill the whole program and restart it, it usually finishes without any problem.
The program I am running is very simple: it streams from an HDFS directory, joins the data with a reference file (also stored on HDFS), performs a simple calculation to update a state by key, and finally prints the resulting DStream. I have no idea why this happens seemingly at random. Is it a bug, a configuration issue, or something in my program? Any hints?

<http://apache-spark-user-list.1001560.n3.nabble.com/file/n16829/38.png>
<http://apache-spark-user-list.1001560.n3.nabble.com/file/n16829/51.png>
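
For readers trying to reproduce this, the pipeline described above has roughly the following shape. This is only a minimal PySpark sketch and not the actual program. The "key,value" line format, the weighted running sum, and every name in it (`parse_pair`, `update_total`, `build_pipeline`, `run`, the app name and the path arguments) are assumptions made for illustration.

```python
def parse_pair(line):
    """Split a 'key,value' text line into (key, float(value)). Format is assumed."""
    key, value = line.split(",", 1)
    return key.strip(), float(value)


def update_total(new_values, running_total):
    """State update for updateStateByKey: add this batch's values to the total.

    running_total is None the first time a key is seen.
    """
    return (running_total or 0.0) + sum(new_values)


def build_pipeline(ssc, input_dir, reference_path):
    """Wire up: file stream -> join with reference -> state by key -> print."""
    # Static reference data, read once and cached for reuse in every batch.
    reference = ssc.sparkContext.textFile(reference_path).map(parse_pair).cache()
    # Files that newly appear in input_dir are picked up in subsequent batches.
    events = ssc.textFileStream(input_dir).map(parse_pair)
    # Join each batch RDD with the reference RDD: (key, (value, weight)).
    joined = events.transform(lambda rdd: rdd.join(reference))
    # Hypothetical "simple calculation": weight each value before accumulating.
    weighted = joined.mapValues(lambda vw: vw[0] * vw[1])
    totals = weighted.updateStateByKey(update_total)
    totals.pprint()
    return totals


def run(input_dir, reference_path, checkpoint_dir, batch_seconds=10):
    """Start the streaming job; the paths are placeholders supplied by the caller."""
    # Imported here so the helpers above can be exercised without Spark installed.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="StreamingJoinSketch")
    ssc = StreamingContext(sc, batch_seconds)
    ssc.checkpoint(checkpoint_dir)  # updateStateByKey requires a checkpoint dir
    build_pipeline(ssc, input_dir, reference_path)
    ssc.start()
    ssc.awaitTermination()
```

Calling `run(...)` with an input directory, a reference file path and a checkpoint directory would start the job. The helper functions are kept free of Spark imports so the parsing and state-update logic can be checked on their own.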