[ 
https://issues.apache.org/jira/browse/SPARK-33041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205059#comment-17205059
 ] 

Russell Spitzer commented on SPARK-33041:
-----------------------------------------

To elaborate, this could happen for any failure that occurs after the 
connection file is written. For example, the OOM killer could come in and shut 
down the gateway, or the gateway could hit some other fatal failure. These 
cases would also produce the same rather opaque messages about queues and 
networking, when the actual problem is that the gateway has shut down.
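A minimal sketch of the kind of check being proposed, in Python. All names here (connect_to_gateway, the error message) are hypothetical and not Spark's actual internals; the real fix would live in PySpark's gateway launch code, but the idea is just: poll the child process before blaming the connection.

{code}
import subprocess
import sys

def connect_to_gateway(proc):
    # Hypothetical helper: before trying to use the conn_info file, check
    # whether the gateway process has already exited. If it has, raise a
    # clear error instead of letting py4j fail later with the confusing
    # "IndexError: pop from an empty deque".
    if proc.poll() is not None:
        raise RuntimeError(
            "Java gateway process exited before sending its port number "
            "(exit code %d)" % proc.returncode)
    # ... otherwise proceed to read the port and open the py4j connection ...
    return True

# A short-lived child process stands in for a crashed gateway.
dead = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(1)"])
dead.wait()
try:
    connect_to_gateway(dead)
except RuntimeError as e:
    print(e)
{code}

With a check like this, the user sees a message naming the real cause (the gateway died) rather than a networking-level symptom.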

> Better error messages when PySpark Java Gateway Crashes
> -------------------------------------------------------
>
>                 Key: SPARK-33041
>                 URL: https://issues.apache.org/jira/browse/SPARK-33041
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.4.7, 3.0.1
>            Reporter: Russell Spitzer
>            Priority: Major
>
> Currently startup works by launching the gateway process and waiting until 
> the process has written the conn_info file. Once that file is written, 
> PySpark attempts to connect to the port it contains.
> This connection can succeed and the process can start normally, but if the 
> gateway process dies or is killed, the error the user ends up getting is 
> a confusing "connection failed"-style error like
> {code}
> Traceback (most recent call last):
>   File "/usr/lib/spark-packages/spark2.4.4/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 929, in _get_connection
>     connection = self.deque.pop()
> IndexError: pop from an empty deque
> {code}
> Since we have a handle on the py4j process, we should check whether it has 
> terminated before surfacing an exception like this.
> CC [~holden]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
