[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Woodbury updated SPARK-21022:
-----------------------------------

Description:

An `RDD.foreach` or `RDD.foreachPartition` call will swallow exceptions thrown inside its closure, but not if the exception was thrown earlier in the call chain. An example:

{code:scala}
package examples

import org.apache.spark._

object Shpark {
  def main(args: Array[String]) {
    implicit val sc: SparkContext = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("blahfoobar")
    )

    /* DOESN'T THROW
    sc.parallelize(0 until 10000000)
      .foreachPartition { _.map { i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        println(i)
      }}
    */

    /* DOESN'T THROW, nor does anything print.
     * Commenting out the exception runs the prints.
     * (i.e. `foreach` is sufficient to "run" an RDD)
    sc.parallelize(0 until 100000)
      .foreach({ i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        println(i)
      })
    */

    /* Throws! */
    sc.parallelize(0 until 100000)
      .map({ i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        i
      })
      .foreach(i => println(i))

    println("JOB DONE!")
    System.in.read
    sc.stop()
  }
}
{code}

When exceptions are swallowed, the jobs don't seem to fail, and the driver exits normally. When one _is_ thrown, as in the last example, the exception successfully rises up to the driver and can be caught with try/catch. The expected behaviour is for exceptions in `foreach` to throw and crash the driver, as they would with `map`.

was: (the same description, with "caught with `catch`" in place of "caught with try/catch")

> foreach swallows exceptions
> ---------------------------
>
>                 Key: SPARK-21022
>                 URL: https://issues.apache.org/jira/browse/SPARK-21022
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.1
>            Reporter: Colin Woodbury
>            Priority: Minor
>
> An `RDD.foreach` or `RDD.foreachPartition` call will swallow exceptions thrown
> inside its closure, but not if the exception was thrown earlier in the call
> chain.
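An aside on the first (`foreachPartition`) example in the description, grounded in plain Scala rather than Spark internals: the closure calls `.map` on the partition's `Iterator` but never consumes the result, and `Iterator.map` is lazy, so the throwing body may simply never execute at all. A minimal Spark-free sketch of that laziness (the object name `LazyMapDemo` is just for illustration):

{code:scala}
object LazyMapDemo {
  def main(args: Array[String]): Unit = {
    var ran = false

    // Iterator.map is lazy: building the mapped iterator runs nothing,
    // so a `throw` inside the body would not fire here.
    val mapped = Iterator(1, 2, 3).map { i => ran = true; i * 2 }
    assert(!ran) // the body has not executed yet

    // Only consuming the iterator forces the body to run.
    val forced = mapped.toList
    assert(ran)
    assert(forced == List(2, 4, 6))
    println("map body ran only on consumption")
  }
}
{code}

If that reading is right, the `foreachPartition` case may be unconsumed laziness rather than exception swallowing, and the plain `foreach` example is the cleaner reproduction of the reported bug.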
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org