[ https://issues.apache.org/jira/browse/SPARK-27992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Cutler resolved SPARK-27992. ---------------------------------- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24834 [https://github.com/apache/spark/pull/24834] > PySpark socket server should sync with JVM connection thread future > ------------------------------------------------------------------- > > Key: SPARK-27992 > URL: https://issues.apache.org/jira/browse/SPARK-27992 > Project: Spark > Issue Type: Improvement > Components: PySpark > Affects Versions: 3.0.0 > Reporter: Bryan Cutler > Assignee: Bryan Cutler > Priority: Major > Fix For: 3.0.0 > > > Both SPARK-27805 and SPARK-27548 identified an issue that errors in a Spark > job are not propagated to Python. This is because toLocalIterator() and > toPandas() with Arrow enabled run Spark jobs asynchronously in a background > thread, after creating the socket connection info. The fix for these was to > catch a SparkException if the job errored and then send the exception through > the pyspark serializer. > A better fix would be to allow Python to await on the serving thread future > and join the thread. That way if the serving thread throws an exception, it > will be propagated on the call to awaitResult. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org