Ah, thanks for the pointers.

So as far as Spark is concerned, is this a breaking change? Is it possible
that people who have working code that accesses S3 will upgrade to use
Spark-against-Hadoop-2.6 and find their code is not working all of a sudden?

Nick

On Thu, May 7, 2015 at 12:48 PM Peter Rudenko <petro.rude...@gmail.com>
wrote:

>  Yep it's a Hadoop issue:
> https://issues.apache.org/jira/browse/HADOOP-11863
>
>
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201504.mbox/%3CCA+XUwYxPxLkfhOxn1jNkoUKEQQMcPWFzvXJ=u+kp28kdejo...@mail.gmail.com%3E
> http://stackoverflow.com/a/28033408/3271168
>
>
> So for now you need to manually add that jar to the classpath on hadoop-2.6.
>
> Thanks,
> Peter Rudenko
>
> On 2015-05-07 19:41, Nicholas Chammas wrote:
>
> I can try that, but the issue is I understand this is supposed to work out
> of the box (like it does with all the other Spark/Hadoop pre-built
> packages).
>
> On Thu, May 7, 2015 at 12:35 PM Peter Rudenko <petro.rude...@gmail.com>
> wrote:
>
>>  Try to download this jar:
>>
>> http://search.maven.org/remotecontent?filepath=org/apache/hadoop/hadoop-aws/2.6.0/hadoop-aws-2.6.0.jar
>>
>> And add:
>>
>> export CLASSPATH=$CLASSPATH:hadoop-aws-2.6.0.jar
>>
>> And try to relaunch.
>>
>> Thanks,
>> Peter Rudenko
>>
>>
>> On 2015-05-07 19:30, Nicholas Chammas wrote:
>>
>>  Hmm, I just tried changing s3n to s3a:
>>
>> py4j.protocol.Py4JJavaError: An error occurred while calling 
>> z:org.apache.spark.api.python.PythonRDD.collectAndServe.
>> : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class 
>> org.apache.hadoop.fs.s3a.S3AFileSystem not found
>>
>> Nick
>>
>>
>> On Thu, May 7, 2015 at 12:29 PM Peter Rudenko <petro.rude...@gmail.com>
>> wrote:
>>
>>>  Hi Nick, had the same issue.
>>> By default it should work with s3a protocol:
>>>
>>> sc.textFile('s3a://bucket/file_*').count()
>>>
>>>
>>> If you want to use the s3n protocol you need to add hadoop-aws.jar to
>>> Spark's classpath. Which Hadoop vendor (Hortonworks, Cloudera, MapR) do you
>>> use?
>>>
>>> Thanks,
>>> Peter Rudenko
>>>
>>> On 2015-05-07 19:25, Nicholas Chammas wrote:
>>>
>>> Details are here: https://issues.apache.org/jira/browse/SPARK-7442
>>>
>>> It looks like something specific to building against Hadoop 2.6?
>>>
>>> Nick
>>>
>>>
>>>
>>>
>>
>
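
For anyone applying the classpath workaround quoted above, here is a minimal sketch in Python of the same step as the `export CLASSPATH=$CLASSPATH:hadoop-aws-2.6.0.jar` line. The helper name `append_to_classpath` is hypothetical and is not part of Spark or Hadoop; the jar file name is the one from the thread and is assumed to sit in the working directory. Unlike the plain shell export, it skips empty entries (an unset CLASSPATH would otherwise leave a leading empty entry, which Java typically treats as the current directory) and avoids adding the jar twice.

```python
import os


def append_to_classpath(classpath, jar_path):
    """Return `classpath` with `jar_path` appended.

    Hypothetical helper for illustration only. Empty entries are dropped
    and the jar is not added a second time if it is already listed.
    """
    # Split on the platform separator (":" on Linux/macOS, ";" on Windows).
    entries = [entry for entry in classpath.split(os.pathsep) if entry]
    if jar_path not in entries:
        entries.append(jar_path)
    return os.pathsep.join(entries)


if __name__ == "__main__":
    # Jar name taken from the thread; its location is an assumption.
    jar = "hadoop-aws-2.6.0.jar"
    updated = append_to_classpath(os.environ.get("CLASSPATH", ""), jar)
    # Only affects this process and any child processes it launches.
    os.environ["CLASSPATH"] = updated
    print(updated)
```

This only rearranges the classpath string; whether Spark then finds `org.apache.hadoop.fs.s3a.S3AFileSystem` still depends on the jar (and anything it depends on) actually being present at that path.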
