[GitHub] spark pull request #21463: [SPARK-23754][BRANCH-2.3][PYTHON] Re-raising Stop...

2018-05-30 Thread e-dorigatti
Github user e-dorigatti closed the pull request at:

https://github.com/apache/spark/pull/21463


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #21463: [SPARK-23754][BRANCH-2.3][PYTHON] Re-raising Stop...

2018-05-30 Thread e-dorigatti
GitHub user e-dorigatti opened a pull request:

https://github.com/apache/spark/pull/21463

[SPARK-23754][BRANCH-2.3][PYTHON] Re-raising StopIteration in client code

## What changes are proposed
Make sure that a `StopIteration` raised in users' code does not silently
interrupt processing by Spark, but is instead surfaced to the user as an
exception. Users' functions are wrapped in `safe_iter` (in `shuffle.py`), which
re-raises `StopIteration` as `RuntimeError`.
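
To illustrate the problem and the idea behind the fix, here is a minimal plain-Python sketch. It is not the code in this PR: `reraise_stop_iteration` and `user_func` are hypothetical names, and the built-in `map` stands in for Spark's processing loop.

```python
import functools


def reraise_stop_iteration(func):
    """Hypothetical wrapper (not Spark's `safe_iter`): turn a StopIteration
    escaping `func` into a RuntimeError, so the caller cannot mistake it
    for the normal end of an iterator."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except StopIteration as exc:
            raise RuntimeError("StopIteration raised in user code") from exc
    return wrapper


def user_func(x):
    # Buggy user code: next() on an exhausted iterator raises StopIteration.
    if x == 3:
        return next(iter([]))
    return x * 10


# Unwrapped: map() lets the StopIteration through and list() treats it as
# the end of the data, so the elements for 3 and 4 are silently dropped.
truncated = list(map(user_func, range(5)))

# Wrapped: the same bug now fails loudly instead of truncating the result.
try:
    list(map(reraise_stop_iteration(user_func), range(5)))
    failed_loudly = False
except RuntimeError:
    failed_loudly = True
```

Here `truncated` ends up as `[0, 10, 20]` with no error, while the wrapped call raises `RuntimeError`.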

## How were the changes tested
Unit tests check that the exceptions are indeed raised. I am not sure how to
check whether a `Py4JJavaError` contains my exception, so I simply look for the
exception message in the Java exception's `toString`. Can you propose a better
way?
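
For reference, the message-matching approach described above could look roughly like the sketch below. It is self-contained and does not use py4j: `FakeJavaError`, `run_job`, and the error text are invented stand-ins, on the assumption that the real error's string form includes the Python exception message.

```python
import unittest


class FakeJavaError(Exception):
    """Hypothetical stand-in for Py4JJavaError, so the sketch runs without
    py4j; its string form plays the role of the Java-side text."""


def run_job():
    # Invented error text: a Java-side trace that embeds the Python message.
    raise FakeJavaError(
        "java-side stack trace ... RuntimeError: StopIteration raised in user code"
    )


class StopIterationMessageTest(unittest.TestCase):
    def test_message_appears_in_error_text(self):
        # assertRaisesRegex searches str(exception) for the pattern.
        with self.assertRaisesRegex(FakeJavaError, "StopIteration raised in user code"):
            run_job()
```

Matching on text is brittle if the message wording changes, which is why the question about a better way remains open.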

This is my original work, licensed in the same way as Spark.

---

Author: e-dorigatti 

Closes #21383 from e-dorigatti/fix_spark_23754.

(cherry picked from commit 0ebb0c0d4dd3e192464dc5e0e6f01efa55b945ed)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/e-dorigatti/spark branch-2.3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21463.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21463


commit 762893682e2bb1e7c5b065eab33e472660cdb4fb
Author: e-dorigatti 
Date:   2018-05-30T10:11:33Z

[SPARK-23754][PYTHON] Re-raising StopIteration in client code

Make sure that a `StopIteration` raised in users' code does not silently
interrupt processing by Spark, but is instead surfaced to the user as an
exception. Users' functions are wrapped in `safe_iter` (in `shuffle.py`), which
re-raises `StopIteration` as `RuntimeError`.

Unit tests check that the exceptions are indeed raised. I am not sure how to
check whether a `Py4JJavaError` contains my exception, so I simply look for the
exception message in the Java exception's `toString`. Can you propose a better
way?

This is my original work, licensed in the same way as Spark.

Author: e-dorigatti 

Closes #21383 from e-dorigatti/fix_spark_23754.

(cherry picked from commit 0ebb0c0d4dd3e192464dc5e0e6f01efa55b945ed)



