GitHub user yhuai opened a pull request:

    https://github.com/apache/spark/pull/9979

    [SPARK-11998] [SQL] When downloading Hadoop artifacts from Maven, we need 
to try to download the version that is used by Spark

    If we need to download Hive/Hadoop artifacts, try to download a Hadoop 
version that matches the one used by Spark. If that Hadoop artifact cannot be 
resolved (e.g. the Hadoop version is a vendor-specific version like 
2.0.0-cdh4.1.1), we will use Hadoop 2.4.0 (the version we previously hard-coded 
for download from Maven) and we will not share Hadoop classes.
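
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code in this pull request: the object `HadoopVersionFallback`, the case class `Resolution`, and the `resolve` callback (standing in for a Maven lookup) are hypothetical names. It also assumes that Hadoop classes are shared when the matching version resolves, which the description implies but does not state outright.

```scala
// Hypothetical sketch: none of these names come from Spark's code base.
object HadoopVersionFallback {
  // Fallback version named in the pull request description.
  val DefaultHadoopVersion = "2.4.0"

  /** Outcome of choosing which Hadoop version to download. */
  final case class Resolution(hadoopVersion: String, shareHadoopClasses: Boolean)

  /**
   * Prefer the Hadoop version Spark itself uses. If it cannot be resolved,
   * fall back to DefaultHadoopVersion and do not share Hadoop classes.
   *
   * `resolve` is a stand-in for the Maven lookup: it returns true when the
   * artifact resolves, and may return false or throw when it does not.
   */
  def chooseHadoop(sparkHadoopVersion: String, resolve: String => Boolean): Resolution = {
    val resolved = scala.util.Try(resolve(sparkHadoopVersion)).getOrElse(false)
    if (resolved) {
      // Assumption: a matching version means Hadoop classes can be shared.
      Resolution(sparkHadoopVersion, shareHadoopClasses = true)
    } else {
      Resolution(DefaultHadoopVersion, shareHadoopClasses = false)
    }
  }
}
```

For example, with a made-up set of resolvable versions `Set("1.2.1", "2.3.0", "2.4.0")`, calling `HadoopVersionFallback.chooseHadoop("2.0.0-cdh4.1.1", published.contains)` would take the fallback branch, since the vendor-specific version is not in the set.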
    
    I tested this patch on my laptop with the following configurations (these 
are the ones used by our builds). All tests passed.
    ```
    build/sbt -Phadoop-1 -Dhadoop.version=1.2.1 -Pkinesis-asl -Phive-thriftserver -Phive
    build/sbt -Phadoop-1 -Dhadoop.version=2.0.0-mr1-cdh4.1.1 -Pkinesis-asl -Phive-thriftserver -Phive
    build/sbt -Pyarn -Phadoop-2.2 -Pkinesis-asl -Phive-thriftserver -Phive
    build/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive-thriftserver -Phive
    ```

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yhuai/spark versionsSuite

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9979.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9979
    
----
commit 7f3e4de4801714c4985a029979b1e9b764a16b93
Author: Yin Huai <yh...@databricks.com>
Date:   2015-11-25T22:25:38Z

    If we need to download Hive/Hadoop artifacts, try to download a Hadoop 
version that matches the one used by Spark. If that Hadoop artifact cannot be 
resolved (e.g. the Hadoop version is a vendor-specific version like 
2.0.0-cdh4.1.1), we will use Hadoop 2.4.0 (the version we previously hard-coded 
for download from Maven) and we will not share Hadoop classes.

----


