[jira] [Commented] (SPARK-21022) RDD.foreach swallows exceptions
[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043223#comment-16043223 ]

Shixiong Zhu commented on SPARK-21022:
--------------------------------------

Good catch...

> RDD.foreach swallows exceptions
> -------------------------------
>
>                 Key: SPARK-21022
>                 URL: https://issues.apache.org/jira/browse/SPARK-21022
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.1
>            Reporter: Colin Woodbury
>            Priority: Minor
>
> An `RDD.foreach` or `RDD.foreachPartition` call will swallow exceptions thrown
> inside its closure, but not if the exception was thrown earlier in the call
> chain. An example:
> {code:none}
> package examples
>
> import org.apache.spark._
>
> object Shpark {
>   def main(args: Array[String]) {
>     implicit val sc: SparkContext = new SparkContext(
>       new SparkConf().setMaster("local[*]").setAppName("blahfoobar")
>     )
>
>     /* DOESN'T THROW
>     sc.parallelize(0 until 1000)
>       .foreachPartition { _.map { i =>
>         println("BEFORE THROW")
>         throw new Exception("Testing exception handling")
>         println(i)
>       }}
>     */
>
>     /* DOESN'T THROW, nor does anything print.
>      * Commenting out the exception runs the prints.
>      * (i.e. `foreach` is sufficient to "run" an RDD)
>     sc.parallelize(0 until 10)
>       .foreach({ i =>
>         println("BEFORE THROW")
>         throw new Exception("Testing exception handling")
>         println(i)
>       })
>     */
>
>     /* Throws! */
>     sc.parallelize(0 until 10)
>       .map({ i =>
>         println("BEFORE THROW")
>         throw new Exception("Testing exception handling")
>         i
>       })
>       .foreach(i => println(i))
>
>     println("JOB DONE!")
>     System.in.read
>     sc.stop()
>   }
> }
> {code}
> When exceptions are swallowed, the jobs don't seem to fail, and the driver
> exits normally. When one _is_ thrown, as in the last example, the exception
> successfully rises up to the driver and can be caught with try/catch.
> The expected behaviour is for exceptions in `foreach` to throw and crash the
> driver, as they would with `map`.
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043247#comment-16043247 ]

Shixiong Zhu commented on SPARK-21022:
--------------------------------------

By the way, `foreachPartition` doesn't have the issue. It's just that `Iterator.map` is lazy and the Iterator is never consumed, so the closure inside it never runs.
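The laziness described above can be reproduced without Spark at all. A minimal sketch (the object and method names below are hypothetical, purely for illustration): `Iterator.map` only builds a lazy view, mirroring why `foreachPartition { _.map { ... } }` never runs the throwing closure, while `foreach` consumes the iterator eagerly, mirroring the `foreachPartition { _.foreach { ... } }` fix.

```scala
object LazyIteratorDemo {
  // Mirrors foreachPartition { _.map { ... } }: Iterator.map only wraps the
  // iterator in a lazy view, so the throwing closure never actually runs.
  def lazyMapSwallows(): Boolean = {
    Iterator(1, 2, 3).map(_ => throw new Exception("boom"))
    true // reached: the closure above was never invoked
  }

  // Mirrors the fix, foreachPartition { _.foreach { ... } }: foreach consumes
  // the iterator eagerly, so the exception propagates as expected.
  def eagerForeachThrows(): Boolean =
    try {
      Iterator(1, 2, 3).foreach(_ => throw new Exception("boom"))
      false // unreachable: foreach throws on the first element
    } catch {
      case _: Exception => true
    }

  def main(args: Array[String]): Unit = {
    println(s"lazy map swallowed the exception: ${lazyMapSwallows()}")
    println(s"eager foreach threw: ${eagerForeachThrows()}")
  }
}
```

Inside Spark, the same reasoning applies per partition: the iterator handed to `foreachPartition` must be consumed (e.g. with `foreach`) for the closure's effects, including exceptions, to happen at all.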
[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043266#comment-16043266 ]

Shixiong Zhu commented on SPARK-21022:
--------------------------------------

Wait. I also checked the `foreach` method. It does throw the exception. Perhaps you just missed the exception in the large volume of log output?
[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043368#comment-16043368 ]

Colin Woodbury commented on SPARK-21022:
----------------------------------------

Ah, OK, that makes sense for `foreachPartition`. And wouldn't you know, I retried my tests with `foreach`, and they _do_ throw now. I swear they weren't this morning :S Anyway, it looks like this isn't a bug after all. Thanks for the confirmation.