Using the hadoop-aws package is probably going to be a little more
complicated than that. Your best bet is a custom build of Spark that
already includes it (build with the -Phadoop-cloud profile). Otherwise
you're likely to hit some nasty dependency conflicts, especially if you
end up mixing different versions of Hadoop.
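
A minimal sketch of such a build, for illustration only. It assumes you
are at the root of a Spark 2.3.x source checkout; the --name value is a
placeholder, and -Phadoop-2.8 is simply copied from the quoted message
below rather than verified against the build. The script prints the
command instead of running it, so it is safe to try anywhere.

```shell
#!/usr/bin/env bash
# Sketch only: prints the build command instead of running it.
# Assumptions: run from the root of a Spark 2.3.x source checkout;
# "hadoop-cloud" as the --name value is a placeholder;
# -Phadoop-2.8 is copied from the quoted message, not verified here.
spark_build_cmd() {
  echo ./dev/make-distribution.sh --name hadoop-cloud --tgz \
    -Phadoop-2.8 -Phadoop-cloud
}

spark_build_cmd
```

Copy the printed line into your shell to start the actual build. With
the cloud connectors bundled into the distribution, pyspark should not
need the --packages org.apache.hadoop:hadoop-aws argument at all.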

On Fri, Jun 1, 2018 at 4:01 PM, Nicholas Chammas
<nicholas.cham...@gmail.com> wrote:
> I was able to successfully launch a Spark cluster on EC2 at 2.3.1 RC4 using
> Flintrock. However, trying to load the hadoop-aws package gave me some
> errors.
>
> $ pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4
>
> <snipped>
>
> :: problems summary ::
> :::: WARNINGS
>                 [NOT FOUND  ] com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle) (2ms)
>         ==== local-m2-cache: tried
>           file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar
>                 [NOT FOUND  ] com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
>         ==== local-m2-cache: tried
>           file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
>                 [NOT FOUND  ] org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
>         ==== local-m2-cache: tried
>           file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
>                 [NOT FOUND  ] com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
>         ==== local-m2-cache: tried
>           file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar
>
> I’d guess I’m probably using the wrong version of hadoop-aws, but I called
> make-distribution.sh with -Phadoop-2.8 so I’m not sure what else to try.
>
> Any quick pointers?
>
> Nick
>
>
> On Fri, Jun 1, 2018 at 6:29 PM Marcelo Vanzin <van...@cloudera.com> wrote:
>>
>> Starting with my own +1 (binding).
>>
>> On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin <van...@cloudera.com>
>> wrote:
>> > Please vote on releasing the following candidate as Apache Spark version
>> > 2.3.1.
>> >
>> > Given that I expect at least a few people to be busy with Spark Summit
>> > next week, I'm taking the liberty of setting an extended voting period.
>> > The vote will be open until Friday, June 8th, at 19:00 UTC (that's
>> > 12:00 PDT).
>> >
>> > It passes with a majority of +1 votes, which must include at least 3 +1
>> > votes from the PMC.
>> >
>> > [ ] +1 Release this package as Apache Spark 2.3.1
>> > [ ] -1 Do not release this package because ...
>> >
>> > To learn more about Apache Spark, please see http://spark.apache.org/
>> >
>> > The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
>> > https://github.com/apache/spark/tree/v2.3.1-rc4
>> >
>> > The release files, including signatures, digests, etc. can be found at:
>> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
>> >
>> > Signatures used for Spark RCs can be found in this file:
>> > https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >
>> > The staging repository for this release can be found at:
>> > https://repository.apache.org/content/repositories/orgapachespark-1272/
>> >
>> > The documentation corresponding to this release can be found at:
>> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/
>> >
>> > The list of bug fixes going into 2.3.1 can be found at the following URL:
>> > https://issues.apache.org/jira/projects/SPARK/versions/12342432
>> >
>> > FAQ
>> >
>> > =========================
>> > How can I help test this release?
>> > =========================
>> >
>> > If you are a Spark user, you can help us test this release by taking
>> > an existing Spark workload and running on this release candidate, then
>> > reporting any regressions.
>> >
>> > If you're working in PySpark you can set up a virtual env, install
>> > the current RC, and see if anything important breaks. In Java/Scala
>> > you can add the staging repository to your project's resolvers and test
>> > with the RC (make sure to clean up the artifact cache before/after so
>> > you don't end up building with an out-of-date RC going forward).
>> >
>> > ===========================================
>> > What should happen to JIRA tickets still targeting 2.3.1?
>> > ===========================================
>> >
>> > The current list of open tickets targeted at 2.3.1 can be found at:
>> > https://s.apache.org/Q3Uo
>> >
>> > Committers should look at those and triage. Extremely important bug
>> > fixes, documentation, and API tweaks that impact compatibility should
>> > be worked on immediately. Everything else please retarget to an
>> > appropriate release.
>> >
>> > ==================
>> > But my bug isn't fixed?
>> > ==================
>> >
>> > In order to make timely releases, we will typically not hold the
>> > release unless the bug in question is a regression from the previous
>> > release. That being said, if there is something which is a regression
>> > that has not been correctly targeted please ping me or a committer to
>> > help target the issue.
>> >
>> >
>> > --
>> > Marcelo
>>
>>
>>
>> --
>> Marcelo
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>



-- 
Marcelo
