[jira] [Created] (SPARK-27992) PySpark socket server should sync with JVM connection thread future

Bryan Cutler (JIRA) Mon, 10 Jun 2019 14:48:40 -0700

Bryan Cutler created SPARK-27992:
------------------------------------

             Summary: PySpark socket server should sync with JVM connection 
thread future
                 Key: SPARK-27992
                 URL: https://issues.apache.org/jira/browse/SPARK-27992
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 2.4.3
         Environment: Both SPARK-27805 and SPARK-27548 identified an issue where 
errors in a Spark job are not propagated to Python. This happens because 
toLocalIterator() and toPandas() with Arrow enabled run Spark jobs 
asynchronously in a background thread, after creating the socket connection 
info. The fix for both was to catch a SparkException if the job failed and 
then send the exception through the PySpark serializer.


A better fix would be to allow Python to synchronize on the serving thread's 
future. That way, if the serving thread throws an exception, it is 
propagated to Python by the synchronization call.
            Reporter: Bryan Cutler
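
The idea of synchronizing on the serving thread's future can be illustrated 
with a minimal, in-process Python sketch. It is only an analogy, not Spark's 
implementation: in Spark the serving thread lives in the JVM and Python would 
reach it through the gateway. The names JobFailed, serve_rows, 
collect_without_sync and collect_with_sync are hypothetical and exist only 
for this example.

```python
from concurrent.futures import ThreadPoolExecutor


class JobFailed(Exception):
    """Hypothetical stand-in for a SparkException raised by a failed job."""


def serve_rows(rows, fail=False):
    # Hypothetical serving routine: pretends to run a job and serve its rows.
    if fail:
        raise JobFailed("job aborted")
    return len(rows)


def collect_without_sync(rows, fail=False):
    # The background task runs, but its future is dropped, so any
    # exception raised inside serve_rows is never seen by the caller.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(serve_rows, rows, fail)
    return None


def collect_with_sync(rows, fail=False):
    # The caller keeps the future and synchronizes on it. result()
    # re-raises whatever exception the serving thread threw.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(serve_rows, rows, fail)
        # ... a real client would read the served data here ...
        return future.result()
```

With fail=True, collect_without_sync returns normally and the error is lost, 
while collect_with_sync raises JobFailed in the caller. This is the behavior 
the proposal asks for, without sending the exception through the data 
channel.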






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

