Github user e-dorigatti commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21383#discussion_r191113977
  
    --- Diff: python/pyspark/util.py ---
    @@ -89,6 +93,33 @@ def majorMinorVersion(sparkVersion):
                                  " version numbers.")
     
     
    +def fail_on_stopiteration(f):
    +    """
    +    Wraps the input function to fail on 'StopIteration' by raising a 'RuntimeError';
    +    this prevents silent loss of data when 'f' is used in a for loop.
    +    """
    +    def wrapper(*args, **kwargs):
    +        try:
    +            return f(*args, **kwargs)
    +        except StopIteration as exc:
    +            raise RuntimeError(
    +                "Caught StopIteration thrown from user's code; failing the task",
    +                exc
    +            )
    +
    +    # prevent inspect from failing
    +    # e.g. inspect.getargspec(sum) raises
    +    # TypeError: <built-in function sum> is not a Python function
    +    try:
    +        argspec = _get_argspec(f)
    --- End diff --
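
    For context before I reply, here is a minimal standalone sketch of the
    pattern this hunk introduces and of the failure mode it guards against
    (`extract_first`, `rows` and `safe` are illustrative names only, not part
    of the PR; on Python 3.7+, PEP 479 already turns a StopIteration escaping
    a generator into a RuntimeError, but the interpreters we support do not
    all do this):

        def fail_on_stopiteration(f):
            """
            Wrap f so that a StopIteration escaping from user code surfaces as
            a RuntimeError instead of silently ending the enclosing loop.
            """
            def wrapper(*args, **kwargs):
                try:
                    return f(*args, **kwargs)
                except StopIteration as exc:
                    raise RuntimeError(
                        "Caught StopIteration thrown from user's code; "
                        "failing the task", exc)
            return wrapper

        def extract_first(rows, extract):
            # on pre-PEP 479 interpreters, a StopIteration raised by extract()
            # inside this generator silently ends the iteration, dropping data
            for row in rows:
                yield extract(row)

        safe = fail_on_stopiteration(lambda row: next(iter(row)))
        list(extract_first([[1], [], [3]], safe))
        # raises RuntimeError instead of silently producing just [1]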
    
    You said to do it in `udf.UserDefinedFunction._create_judf`, but sent the
    code of `udf._create_udf`. I assume you meant the former, since we cannot
    do that in `_create_udf`: `UserDefinedFunction._wrapped` needs the original
    function for its documentation and other attributes. I will also simplify
    the code as you suggested, yes.
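
    To make the placement concrete, a rough, hypothetical sketch of what I
    have in mind (the class below is a made-up placeholder, not PySpark's
    actual code; `fail_on_stopiteration` is the helper added by the diff
    above):

        from pyspark.util import fail_on_stopiteration  # added by this PR

        class UserDefinedFunctionSketch(object):
            def __init__(self, func, returnType):
                # keep the original callable untouched so that _wrapped can
                # still copy its docstring, name, etc.
                self.func = func
                self.returnType = returnType

            def _create_judf(self):
                # wrap only here, right before the function would be handed
                # to _wrap_function / the JVM; the instance keeps the original
                safe_func = fail_on_stopiteration(self.func)
                return safe_func  # stand-in for the real py4j plumbing

            def _wrapped(self):
                def wrapper(*args):
                    return self.func(*args)
                # documentation comes from the original, unwrapped function
                wrapper.__doc__ = self.func.__doc__
                wrapper.__name__ = getattr(self.func, "__name__", "udf")
                return wrapper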

