Github user e-dorigatti commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21383#discussion_r190603773

    --- Diff: python/pyspark/tests.py ---
    @@ -1246,6 +1277,25 @@ def test_pipe_unicode(self):
             result = rdd.pipe('cat').collect()
             self.assertEqual(data, result)

    +    def test_stopiteration_in_client_code(self):
    +
    +        def stopit(*x):
    +            raise StopIteration()
    +
    +        seq_rdd = self.sc.parallelize(range(10))
    +        keyed_rdd = self.sc.parallelize((x % 2, x) for x in range(10))
    +        exc = Py4JJavaError, RuntimeError
    --- End diff --

    Both of them can happen, depending on where the `StopIteration` is raised. Consider `RDD.reduce`, for example: if the exception is raised while reducing inside a partition, the user gets a `Py4JJavaError`, but if it is raised while reducing the partial results locally [here](https://github.com/e-dorigatti/spark/blob/fix_spark_23754/python/pyspark/rdd.py#L858), the user gets a `RuntimeError` (the one we raise in `fail_on_stopiteration`).
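
    The idea behind `fail_on_stopiteration` can be sketched as a plain decorator (this is a simplified, hypothetical version for illustration, not the exact PySpark implementation): a `StopIteration` escaping user code into a surrounding generator would silently truncate iteration, so it is converted into a `RuntimeError` that propagates normally.

    ```python
    def fail_on_stopiteration(f):
        """Wrap f so that a StopIteration raised inside it surfaces as a
        RuntimeError instead of silently ending an enclosing generator."""
        def wrapper(*args, **kwargs):
            try:
                return f(*args, **kwargs)
            except StopIteration as exc:
                raise RuntimeError(
                    "Caught StopIteration thrown from user's code; failing the task"
                ) from exc
        return wrapper

    # Hypothetical user function that misbehaves, like `stopit` in the test above
    def bad_reducer(a, b):
        raise StopIteration()

    safe = fail_on_stopiteration(bad_reducer)
    try:
        safe(1, 2)
    except RuntimeError as e:
        print("converted:", e)
    ```

    This is why the locally-reduced path raises `RuntimeError` directly, while the same error raised inside a partition first crosses the Py4J boundary and reaches the user wrapped in a `Py4JJavaError`.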