[jira] [Commented] (SPARK-21022) RDD.foreach swallows exceptions

2017-06-08 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043223#comment-16043223 ]

Shixiong Zhu commented on SPARK-21022:
--

Good catch...

> RDD.foreach swallows exceptions
> ---
>
> Key: SPARK-21022
> URL: https://issues.apache.org/jira/browse/SPARK-21022
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.1
>Reporter: Colin Woodbury
>Priority: Minor
>
> An `RDD.foreach` or `RDD.foreachPartition` call swallows exceptions thrown
> inside its closure, but not if the exception is thrown earlier in the call
> chain. An example:
> {code:none}
> package examples
>
> import org.apache.spark._
>
> object Shpark {
>   def main(args: Array[String]): Unit = {
>     implicit val sc: SparkContext = new SparkContext(
>       new SparkConf().setMaster("local[*]").setAppName("blahfoobar")
>     )
>
>     /* DOESN'T THROW
>     sc.parallelize(0 until 1000)
>       .foreachPartition { _.map { i =>
>         println("BEFORE THROW")
>         throw new Exception("Testing exception handling")
>         println(i)
>       }}
>     */
>
>     /* DOESN'T THROW, nor does anything print.
>      * Commenting out the exception runs the prints.
>      * (i.e. `foreach` is sufficient to "run" an RDD)
>     sc.parallelize(0 until 10)
>       .foreach({ i =>
>         println("BEFORE THROW")
>         throw new Exception("Testing exception handling")
>         println(i)
>       })
>     */
>
>     /* Throws! */
>     sc.parallelize(0 until 10)
>       .map({ i =>
>         println("BEFORE THROW")
>         throw new Exception("Testing exception handling")
>         i
>       })
>       .foreach(i => println(i))
>
>     println("JOB DONE!")
>     System.in.read
>     sc.stop()
>   }
> }
> {code}
> When exceptions are swallowed, the jobs don't seem to fail, and the driver 
> exits normally. When one _is_ thrown, as in the last example, the exception 
> successfully rises up to the driver and can be caught with try/catch.
> The expected behaviour is for exceptions in `foreach` to propagate and crash
> the driver, as they do with `map`.
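
For illustration, a minimal driver-side sketch (not from the ticket) of catching
the failure described above; a failed job surfaces on the driver wrapped in an
`org.apache.spark.SparkException`:

{code:none}
// Sketch only: assumes the same `sc` as in the example above.
try {
  sc.parallelize(0 until 10)
    .map { i =>
      throw new Exception("Testing exception handling")
      i
    }
    .foreach(println)
} catch {
  case e: org.apache.spark.SparkException =>
    // The task failure is wrapped; the original message appears in
    // the failure description.
    println("caught on the driver: " + e.getMessage)
}
{code}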






[jira] [Commented] (SPARK-21022) RDD.foreach swallows exceptions

2017-06-08 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043247#comment-16043247 ]

Shixiong Zhu commented on SPARK-21022:
--

By the way, `foreachPartition` doesn't actually have this issue: `Iterator.map`
is lazy, so the closure never runs unless the mapped iterator is consumed.




[jira] [Commented] (SPARK-21022) RDD.foreach swallows exceptions

2017-06-08 Thread Shixiong Zhu (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043266#comment-16043266 ]

Shixiong Zhu commented on SPARK-21022:
--

Wait. I also checked the `foreach` method, and it does throw the exception.
Perhaps you just missed it in the volume of log output?
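
One way to make the exception stand out, sketched under the assumption of a
local-mode driver console: lower Spark's log level so the failure and its stack
trace aren't buried in INFO output:

{code:none}
// Quiet the scheduler/executor INFO logs before running the job.
sc.setLogLevel("ERROR")

sc.parallelize(0 until 10).foreach { _ =>
  throw new Exception("Testing exception handling")
}
// The driver now fails loudly with an org.apache.spark.SparkException
// describing the task failure, including the original stack trace.
{code}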




[jira] [Commented] (SPARK-21022) RDD.foreach swallows exceptions

2017-06-08 Thread Colin Woodbury (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-21022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043368#comment-16043368 ]

Colin Woodbury commented on SPARK-21022:


Ah, okay, that makes sense for `foreachPartition`. And wouldn't you know it, I
retried my tests with `foreach`, and they _do_ throw now. I swear they weren't
throwing this morning :S

Anyway, it looks like this isn't a bug after all. Thanks for the confirmation.
