Note: EMR builds Hadoop, Spark, and other applications from source against
specific versions of certain packages, such as the AWS Java SDK,
httpclient/core, and Jackson. This sometimes requires patching these
applications so that they work with dependency versions that differ from
what the applications support upstream.

For emr-5.8.0, we have built Hadoop and Spark (the Spark Kinesis connector,
that is, since that's the only part of Spark that actually depends upon the
AWS Java SDK directly) against AWS Java SDK 1.11.160 instead of the much
older version that vanilla Hadoop 2.7.3 would otherwise depend upon.

~ Jonathan
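
To make the version-matching advice in this thread concrete, here is a minimal sketch of a classpath sanity check. It is an illustration, not an EMR or Spark tool. It assumes JARs follow the usual `<artifact>-<version>.jar` naming, and the function names (`jar_versions`, `check_classpath`) and the sample file names in the usage note are hypothetical. It only flags two things: a hadoop-aws JAR whose version differs from hadoop-common, and more than one AWS Java SDK version on the same classpath.

```python
import re
from collections import defaultdict

# Assumed naming convention: "<artifact>-<version>.jar",
# e.g. "hadoop-aws-2.7.3.jar" or "aws-java-sdk-1.11.160.jar".
JAR_PATTERN = re.compile(
    r"^(?P<artifact>[A-Za-z0-9_.-]+?)-(?P<version>\d+(?:\.\d+)*)\.jar$"
)


def jar_versions(jar_names):
    """Map each artifact name to the set of versions found for it."""
    versions = defaultdict(set)
    for name in jar_names:
        match = JAR_PATTERN.match(name)
        if match:  # silently skip names that do not fit the convention
            versions[match.group("artifact")].add(match.group("version"))
    return dict(versions)


def check_classpath(jar_names):
    """Return a list of warnings about likely version mismatches."""
    versions = jar_versions(jar_names)
    warnings = []

    # hadoop-aws should be the same release as the rest of Hadoop.
    hadoop_aws = versions.get("hadoop-aws", set())
    hadoop_common = versions.get("hadoop-common", set())
    if hadoop_aws and hadoop_common and hadoop_aws != hadoop_common:
        warnings.append(
            "hadoop-aws %s does not match hadoop-common %s"
            % (sorted(hadoop_aws), sorted(hadoop_common))
        )

    # Only one AWS Java SDK version should be present, whether it ships
    # as a single JAR or as per-service modules (aws-java-sdk-s3, ...).
    sdk_versions = set()
    for artifact, found in versions.items():
        if artifact.startswith("aws-java-sdk"):
            sdk_versions |= found
    if len(sdk_versions) > 1:
        warnings.append(
            "multiple AWS Java SDK versions found: %s" % sorted(sdk_versions)
        )

    return warnings
```

For example, `check_classpath(["hadoop-common-2.7.3.jar", "hadoop-aws-2.7.3.jar", "aws-java-sdk-1.7.4.jar"])` returns an empty list, while mixing in a `hadoop-aws-2.8.1.jar` or a second SDK version produces one warning each. The check cannot tell you which SDK version a given hadoop-aws release was built against. That still has to come from the Hadoop build itself or, on EMR, from the release notes.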

On Wed, Oct 4, 2017 at 7:17 AM Steve Loughran <ste...@hortonworks.com>
wrote:

> On 3 Oct 2017, at 21:37, JG Perrin <jper...@lumeris.com> wrote:
>
> Sorry Steve – I may not have been very clear: thinking about
> aws-java-sdk-z.yy.xxx.jar. To the best of my knowledge, none is bundled
> with Spark.
>
>
>
> I know, but if you are talking to S3 via the s3a client, the SDK version
> needs to match the one the hadoop-aws JAR was built against, and hadoop-aws
> needs to be the same version as the rest of your Hadoop JARs. Similarly, if
> you were using spark-kinesis, it needs to be in sync there.
>
>
> *From:* Steve Loughran [mailto:ste...@hortonworks.com
> <ste...@hortonworks.com>]
> *Sent:* Tuesday, October 03, 2017 2:20 PM
> *To:* JG Perrin <jper...@lumeris.com>
> *Cc:* user@spark.apache.org
> *Subject:* Re: Quick one... AWS SDK version?
>
>
>
> On 3 Oct 2017, at 02:28, JG Perrin <jper...@lumeris.com> wrote:
>
> Hey Sparkians,
>
> What version of AWS Java SDK do you use with Spark 2.2? Do you stick with
> the Hadoop 2.7.3 libs?
>
>
> You generally have to stick with the version which Hadoop was built
> with, I'm afraid... very brittle dependency.
>
>
