Re: Error when saving a dataframe as ORC file

Ted Yu Sun, 23 Aug 2015 05:52:30 -0700

You may have seen this:
http://search-hadoop.com/m/q3RTtdSyM52urAyI




> On Aug 23, 2015, at 1:01 AM, lostrain A <donotlikeworkingh...@gmail.com> 
> wrote:
> 
> Hi,
>   I'm trying to save a simple dataframe to S3 in ORC format. The code is as 
> follows:
> 
> 
>>      val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
>>      import sqlContext.implicits._
>>      val df = sc.parallelize(1 to 1000).toDF()
>>      df.write.format("orc").save("s3://logs/dummy")
> 
> I ran the above code in spark-shell and only the _SUCCESS file was saved 
> under the directory.
> The last part of the spark-shell log said:
> 
>> 15/08/23 07:38:23 task-result-getter-1 INFO TaskSetManager: Finished task 
>> 95.0 in stage 2.0 (TID 295) in 801 ms on ip-*-*-*-*.ec2.internal (100/100)
>  
>> 15/08/23 07:38:23 dag-scheduler-event-loop INFO DAGScheduler: ResultStage 2 
>> (save at <console>:29) finished in 0.834 s
>  
>> 15/08/23 07:38:23 task-result-getter-1 INFO YarnScheduler: Removed TaskSet 
>> 2.0, whose tasks have all completed, from pool
>  
>> 15/08/23 07:38:23 main INFO DAGScheduler: Job 2 finished: save at 
>> <console>:29, took 0.895912 s
>  
>> 15/08/23 07:38:24 main INFO 
>> LocalDirAllocator$AllocatorPerContext$DirSelector: Returning directory: 
>> /media/ephemeral0/s3/output-
>  
>> 15/08/23 07:38:24 main ERROR NativeS3FileSystem: md5Hash for dummy/_SUCCESS 
>> is [-44, 29, -128, -39, -113, 0, -78,
>>  4, -23, -103, 9, -104, -20, -8, 66, 126]
>  
>> 15/08/23 07:38:24 main INFO DefaultWriterContainer: Job job_****_**** 
>> committed.
> 
> Has anyone experienced this before?
> Thanks!
>
