Russell Spitzer created SPARK-33041:
---------------------------------------

             Summary: Better error messages when PySpark Java Gateway Fails to Start or Crashes
                 Key: SPARK-33041
                 URL: https://issues.apache.org/jira/browse/SPARK-33041
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 2.4.7
            Reporter: Russell Spitzer


Currently, startup works by launching the gateway process and waiting until 
the process has written the conn_info_file. Once the conn_info_file is written, 
it attempts to connect to the port.
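
As a rough illustration of that wait step (a sketch only, not the actual PySpark code), the loop can poll both the file and the process handle, so that an early exit is reported directly. The function name `wait_for_conn_info` and the `poll_interval` parameter are hypothetical.

```python
import os
import subprocess
import sys
import time


def wait_for_conn_info(proc, conn_info_file, poll_interval=0.1):
    """Block until conn_info_file exists or the gateway process exits.

    Hypothetical helper: `proc` is assumed to be the subprocess.Popen
    handle for the gateway process.
    """
    # poll() returns None while the process is still running.
    while proc.poll() is None and not os.path.isfile(conn_info_file):
        time.sleep(poll_interval)

    if not os.path.isfile(conn_info_file):
        # The loop ended because the process exited without writing the file.
        raise RuntimeError(
            "Gateway process exited with code %s before writing %s"
            % (proc.returncode, conn_info_file)
        )
    return conn_info_file
```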

This connection can succeed and the process can start normally, but if the 
gateway process dies or is killed, the user ends up with a confusing 
"connection_failed"-style error such as:

{code}
Traceback (most recent call last):
  File "/usr/lib/spark-packages/spark2.4.4/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 929, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque
{code}

Since we have a handle on the py4j process, we should probably check whether it 
has terminated before surfacing any exceptions like this. 
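
One possible shape for that check, as a minimal sketch: wrap the call that talks to the gateway and, on failure, look at the process handle before re-raising. This assumes the handle is a `subprocess.Popen` object; `GatewayExitedError`, `describe_gateway_failure`, and `call_with_gateway_check` are hypothetical names and not part of PySpark or py4j.

```python
import subprocess
import sys


class GatewayExitedError(RuntimeError):
    """Hypothetical error raised when the gateway process is no longer running."""


def describe_gateway_failure(proc, original_error):
    """Return a clearer exception if the gateway process has exited."""
    exit_code = proc.poll()  # None while the process is still running
    if exit_code is None:
        # Gateway is still alive, so the original error is the best we have.
        return original_error
    return GatewayExitedError(
        "Java gateway process exited with code %d; original error: %r"
        % (exit_code, original_error)
    )


def call_with_gateway_check(proc, fn, *args, **kwargs):
    """Call fn and, if it fails, report a dead gateway instead of the raw error."""
    try:
        return fn(*args, **kwargs)
    except Exception as e:
        err = describe_gateway_failure(proc, e)
        if err is e:
            raise
        # Chain the original exception so the py4j traceback is not lost.
        raise err from e
```

With something like this in place, the `IndexError` above would only reach the user if the gateway were still running; otherwise the message would name the exit code of the gateway process.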

CC [~holden]




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
