[ https://issues.apache.org/jira/browse/SPARK-27992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-27992:
------------------------------------

    Assignee:     (was: Apache Spark)

> PySpark socket server should sync with JVM connection thread future
> -------------------------------------------------------------------
>
>                 Key: SPARK-27992
>                 URL: https://issues.apache.org/jira/browse/SPARK-27992
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 3.0.0
>            Reporter: Bryan Cutler
>            Priority: Major
>
> Both SPARK-27805 and SPARK-27548 identified an issue in which errors in a
> Spark job are not propagated to Python. This is because toLocalIterator() and
> toPandas() with Arrow enabled run Spark jobs asynchronously in a background
> thread, after creating the socket connection info. The fix for these was to
> catch a SparkException if the job errored and then send the exception through
> the PySpark serializer.
> A better fix would be to allow Python to synchronize on the serving thread
> future. That way, if the serving thread throws an exception, it will be
> propagated on the synchronization call.
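
The synchronization idea in the issue description can be sketched in plain Python, without Spark. This is only an illustration of the pattern, not the PySpark implementation: `JobFailed`, `serve_rows`, and `collect_with_sync` are hypothetical names, and a `concurrent.futures` future stands in for the JVM serving thread future.

```python
from concurrent.futures import ThreadPoolExecutor


class JobFailed(Exception):
    """Hypothetical stand-in for an error raised by a Spark job."""


def serve_rows(rows, fail=False):
    # Stand-in for the serving thread: in the real system it would run the
    # job and write results to the socket. Here it only counts the rows.
    if fail:
        raise JobFailed("job failed in serving thread")
    return len(rows)


def collect_with_sync(rows, fail=False):
    # Start the "serving thread" in the background, as the issue describes.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(serve_rows, rows, fail)
        # ... the caller would read from the socket here ...
        # Synchronizing on the future re-raises any exception that was
        # thrown in the background thread, so the caller sees the error.
        return future.result()
```

Calling `collect_with_sync([1, 2, 3])` returns the row count, while `collect_with_sync([1], fail=True)` raises `JobFailed` in the calling thread instead of losing the error in the background thread.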