[ https://issues.apache.org/jira/browse/SPARK-27992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Cutler updated SPARK-27992:
---------------------------------
    Description: 
Both SPARK-27805 and SPARK-27548 identified an issue where errors in a Spark job 
are not propagated to Python. This is because toLocalIterator() and toPandas() 
with Arrow enabled run Spark jobs asynchronously in a background thread, after 
the socket connection info has been created. The fix for those issues was to 
catch a SparkException if the job failed and then send the exception through the 
pyspark serializer.
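
For illustration, a minimal, generic sketch of that workaround (the record framing 
and names here are hypothetical, not the actual PySpark serializer protocol): the 
background serving thread catches the failure and writes the pickled exception into 
the same stream as the data, and the Python-side reader re-raises it when it sees 
the error record.

{code:python}
import pickle
import socket
import struct
import threading

DATA, ERROR = 0, 1  # hypothetical record types for this sketch


def _send(sock, kind, payload):
    body = pickle.dumps(payload)
    sock.sendall(struct.pack(">BI", kind, len(body)) + body)


def serve(sock, run_job):
    # Background serving thread: stream batches, or ship the failure itself.
    try:
        for batch in run_job():
            _send(sock, DATA, batch)
    except Exception as exc:  # stands in for catching SparkException
        _send(sock, ERROR, exc)
    finally:
        sock.close()


def read_stream(sock):
    # Python-side reader: yields data records, re-raises an error record.
    buf = sock.makefile("rb")
    while True:
        header = buf.read(5)
        if len(header) < 5:
            return
        kind, length = struct.unpack(">BI", header)
        payload = pickle.loads(buf.read(length))
        if kind == ERROR:
            raise payload
        yield payload


if __name__ == "__main__":
    server, client = socket.socketpair()

    def failing_job():
        yield [1, 2, 3]
        raise RuntimeError("job failed after the connection info was sent")

    threading.Thread(target=serve, args=(server, failing_job)).start()
    try:
        for batch in read_stream(client):
            print("got", batch)
    except RuntimeError as e:
        print("propagated:", e)
{code}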

A better fix would be to allow Python to synchronize on the serving thread 
future. That way, if the serving thread throws an exception, it is propagated 
on the synchronization call.
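
A minimal sketch of what that synchronization could look like in plain Python (the 
future-based API below is assumed for illustration, not Spark's actual interface): 
the serving function runs as a future, and the consumer calls future.result() after 
draining the stream, so any exception raised in the serving thread surfaces on that 
call rather than being silently dropped.

{code:python}
import socket
from concurrent.futures import ThreadPoolExecutor


def serve(sock):
    # Serving thread: send some data, then fail as a Spark job might.
    with sock:
        sock.sendall(b"partial results")
        raise RuntimeError("Spark job failed while serving")


if __name__ == "__main__":
    server, client = socket.socketpair()
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(serve, server)
        with client:
            print("received:", client.recv(4096))
        # Synchronize with the serving thread's future: its exception,
        # if any, is re-raised here on the synchronization call.
        try:
            future.result()
        except RuntimeError as e:
            print("propagated on sync:", e)
{code}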

> PySpark socket server should sync with JVM connection thread future
> -------------------------------------------------------------------
>
>                 Key: SPARK-27992
>                 URL: https://issues.apache.org/jira/browse/SPARK-27992
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.4.3
>            Reporter: Bryan Cutler
>            Priority: Major
>
> Both SPARK-27805 and SPARK-27548 identified an issue where errors in a Spark 
> job are not propagated to Python. This is because toLocalIterator() and 
> toPandas() with Arrow enabled run Spark jobs asynchronously in a background 
> thread, after the socket connection info has been created. The fix for those 
> issues was to catch a SparkException if the job failed and then send the 
> exception through the pyspark serializer.
>
> A better fix would be to allow Python to synchronize on the serving thread 
> future. That way, if the serving thread throws an exception, it is propagated 
> on the synchronization call.


