GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/4603

    [SPARK-2313] Use socket to communicate GatewayServer port back to Python 
driver

    This patch changes PySpark so that the GatewayServer's port is communicated 
back to the Python process that launches it over a local socket instead of a 
pipe.  The old pipe-based approach was brittle and could fail if `spark-submit` 
printed unexpected output to stdout.
    
    To accomplish this, I wrote a custom `PythonGatewayServer.main()` function 
to use in place of Py4J's `GatewayServer.main()`.
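
The socket handoff described above can be sketched roughly as follows. This is a minimal illustration and not the code in the patch: the function names (`launch_and_read_port`, `report_port`, `recv_exact`), the 4-byte big-endian wire format, and the use of a thread as a stand-in for the launched JVM are all assumptions made for the example. The idea is that the Python side listens on an ephemeral loopback port, tells the child which port to call back on, and then reads the gateway port from the accepted connection, so nothing printed to stdout can corrupt the value.

```python
import socket
import struct
import threading


def recv_exact(conn, n):
    """Read exactly n bytes from conn, or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise EOFError("peer closed before sending the full port")
        buf += chunk
    return buf


def report_port(callback_port, gateway_port):
    """Hypothetical stand-in for the launched JVM: connect back to the
    Python process and send the gateway port as a 4-byte big-endian int."""
    with socket.create_connection(("127.0.0.1", callback_port)) as s:
        s.sendall(struct.pack("!i", gateway_port))


def launch_and_read_port(fake_gateway_port):
    """Listen on an OS-chosen loopback port and read the port sent back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        listener.settimeout(10)          # do not hang forever if the child dies
        callback_port = listener.getsockname()[1]

        # Assumption: a real launcher would start the JVM here and hand it
        # callback_port (for example through an environment variable).
        # A thread plays that role so the sketch is self-contained.
        child = threading.Thread(
            target=report_port, args=(callback_port, fake_gateway_port)
        )
        child.start()

        conn, _ = listener.accept()
        with conn:
            (port,) = struct.unpack("!i", recv_exact(conn, 4))
        child.join()
        return port
```

Calling `launch_and_read_port(25333)` returns the same number the stand-in child sent. Because the port travels over its own connection, log lines or banners written to stdout by the child no longer matter.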

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark SPARK-2313

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4603.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4603
    
----
commit 8bf956ea3ac7d9af481acea3a14b9e48dc0ba2fa
Author: Josh Rosen <joshro...@databricks.com>
Date:   2015-02-14T01:01:09Z

    Initial cut at passing Py4J gateway port back to driver via socket

commit 2f70689aebed4dcee67d2dbc9ee42255f6324b5f
Author: Josh Rosen <joshro...@databricks.com>
Date:   2015-02-14T04:51:01Z

    Use stdin PIPE to share fate with driver

----


