GitHub user sryza opened a pull request:

    https://github.com/apache/spark/pull/30

    SPARK-1004.  PySpark on YARN

    This reopens https://github.com/apache/incubator-spark/pull/640 against the new repo.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sryza/spark sandy-spark-1004

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/30.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #30
    
----
commit e49ff667154de988a1cb58d90c9743c6c24ef5bc
Author: Josh Rosen <joshro...@apache.org>
Date:   2014-01-24T18:19:58Z

    Automatically set Yarn env vars in PySpark (SPARK-1030).

commit 59ac972026a7600fded49d906ef27bbb017fc9d2
Author: Josh Rosen <joshro...@apache.org>
Date:   2014-01-25T23:28:56Z

    WIP towards PySpark on YARN:
    
    - Remove reliance on SPARK_HOME on the workers.  Only the driver
      should know about SPARK_HOME.  On the workers, we ensure that the
      PySpark Python libraries are added to the PYTHONPATH.
    
    - Add a Makefile for generating a "fat zip" that contains PySpark's
      Python dependencies.  This is a bit of a hack and I'd be open to
      better packaging tools, but this doesn't require any extra Python
      libraries.  This use case doesn't seem to be well-addressed by the
      existing Python packaging tools: there are plenty of tools to package
      complete Python environments (such as pyinstaller and virtualenv) or
      to bundle *individual* libraries (e.g. distutils), but few to generate
      portable fat zips or eggs.
    
    This hasn't been tested with YARN and may not actually compile.

commit 54bd8c0aec51d5d5cb24d6453dea2fb627db05cd
Author: Josh Rosen <joshro...@apache.org>
Date:   2014-02-19T06:27:21Z

    Add missing setup.py file for PySpark.

commit 514b2d0cfc8995b86186d02aebf61500d25df7db
Author: Sandy Ryza <sa...@cloudera.com>
Date:   2014-02-24T07:06:42Z

    Improvements

commit ee3cc204dcabd7d092e3d6ed205e01c5deffc7ca
Author: Sandy Ryza <sa...@cloudera.com>
Date:   2014-02-24T07:26:01Z

    Don't set SPARK_JAR

----
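
The second commit above describes bundling PySpark's Python dependencies into a "fat zip" that workers put on their PYTHONPATH instead of relying on SPARK_HOME. The sketch below illustrates that general idea only. It is not the Makefile from this pull request: it uses Python's standard zipfile module, and the names build_fat_zip, demo, demo_pkg and deps.zip are hypothetical, chosen for illustration.

```python
import os
import sys
import tempfile
import zipfile


def build_fat_zip(package_dirs, zip_path):
    """Bundle the .py files of each package directory into a single zip.

    Archive paths are kept relative to each package's parent directory,
    so the packages stay importable once the zip is on sys.path.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for pkg_dir in package_dirs:
            parent = os.path.dirname(os.path.abspath(pkg_dir))
            for root, _dirs, files in os.walk(pkg_dir):
                for name in files:
                    if name.endswith(".py"):
                        full = os.path.join(root, name)
                        zf.write(full, os.path.relpath(full, parent))
    return zip_path


def demo():
    """Create a throwaway package, zip it, and import it from the zip."""
    work = tempfile.mkdtemp()
    pkg = os.path.join(work, "src", "demo_pkg")  # hypothetical package
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write("VALUE = 42\n")

    zip_path = build_fat_zip([pkg], os.path.join(work, "deps.zip"))

    # Equivalent to listing the zip in PYTHONPATH before the worker starts.
    sys.path.insert(0, zip_path)
    import demo_pkg
    return demo_pkg.VALUE, demo_pkg.__file__
```

Python can import pure-Python packages directly from a zip on sys.path, so in this sketch the worker side needs only the zip's location rather than a full installation. Packages with compiled extension modules cannot be loaded this way.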


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
