In your case, I would specify "fs.s3.awsAccessKeyId" /
"fs.s3.awsSecretAccessKey", since you are using the s3:// protocol.

On Sun, Aug 23, 2015 at 11:03 AM, lostrain A <donotlikeworkingh...@gmail.com> wrote:

> Hi Ted,
>   Thanks for the reply. I tried setting both the key ID and the secret access key via
>
>     sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "***")
>     sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "**")
>
> However, the error still occurs for the ORC format.
>
> If I change the format to JSON, the error does not go away, but the JSON
> files are saved successfully.
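> Roughly, the comparison boils down to this (the paths below are
> placeholders for the real bucket/prefix):
>
>     df.write.format("json").save("s3://logs/dummy_json") // error logged, but data files written
>     df.write.format("orc").save("s3://logs/dummy_orc")   // error logged, only _SUCCESS written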
>
>
> On Sun, Aug 23, 2015 at 5:51 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> You may have seen this:
>> http://search-hadoop.com/m/q3RTtdSyM52urAyI
>>
>>
>> On Aug 23, 2015, at 1:01 AM, lostrain A <donotlikeworkingh...@gmail.com> wrote:
>>
>> Hi,
>>   I'm trying to save a simple DataFrame to S3 in ORC format. The code is
>> as follows:
>>
>>     val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>     import sqlContext.implicits._
>>     val df = sc.parallelize(1 to 1000).toDF()
>>     df.write.format("orc").save("s3://logs/dummy")
>>
>> I ran the above code in spark-shell and only the _SUCCESS file was saved
>> under the directory.
>> The last part of the spark-shell log said:
>>
>> 15/08/23 07:38:23 task-result-getter-1 INFO TaskSetManager: Finished task 95.0 in stage 2.0 (TID 295) in 801 ms on ip-*-*-*-*.ec2.internal (100/100)
>> 15/08/23 07:38:23 dag-scheduler-event-loop INFO DAGScheduler: ResultStage 2 (save at <console>:29) finished in 0.834 s
>> 15/08/23 07:38:23 task-result-getter-1 INFO YarnScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool
>> 15/08/23 07:38:23 main INFO DAGScheduler: Job 2 finished: save at <console>:29, took 0.895912 s
>> 15/08/23 07:38:24 main INFO LocalDirAllocator$AllocatorPerContext$DirSelector: Returning directory: /media/ephemeral0/s3/output-
>> 15/08/23 07:38:24 main ERROR NativeS3FileSystem: md5Hash for dummy/_SUCCESS is [-44, 29, -128, -39, -113, 0, -78, 4, -23, -103, 9, -104, -20, -8, 66, 126]
>> 15/08/23 07:38:24 main INFO DefaultWriterContainer: Job job_****_**** committed.
>>
>> Has anyone experienced this before?
>> Thanks!
>>
