[ https://issues.apache.org/jira/browse/HIVE-15054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608359#comment-15608359 ]
Aihua Xu edited comment on HIVE-15054 at 10/26/16 12:50 PM:
------------------------------------------------------------

[~lirui] Thanks for taking a look. It would be hard to repro; it depends on which state the first executor is in when it is aborted or dies. You will see this issue when the task has finished writing the data to a tmp file and is renaming it to the final tmp file, while at that moment Spark kills the task (as in your case) or the executor loses its connection. The case I have seen is: the connection to the executor times out, but the executor is almost done with its work (the result has finished writing and been renamed to the final tmp file, and the only thing left is to report to the driver that the task is done). If the rename doesn't happen, you won't see this issue.

was (Author: aihuaxu):
[~lirui] Thanks for taking a look. It would be hard to repro; it depends on which state the first executor is in when it is aborted or dies. You will see this issue when the task has finished writing the data to a tmp file and is renaming it to the final tmp file, but Spark kills the task (as in your case) or the executor loses the connection at that time. The case I have seen is: the connection to the executor times out, but the executor is almost done with its work (the result has finished writing and been renamed to the final tmp file, and the only thing left is to report to the driver that the task is done). If the rename doesn't happen, you won't see this issue.

> Hive insertion query execution fails on Hive on Spark
> -----------------------------------------------------
>
>                 Key: HIVE-15054
>                 URL: https://issues.apache.org/jira/browse/HIVE-15054
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 2.0.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-15054.1.patch
>
>
> The query of {{insert overwrite table tbl1}} sometimes will fail with the
> following errors.
> Seems we are constructing taskAttemptId with partitionId,
> which is not unique if there are multiple attempts.
> {noformat}
> java.lang.IllegalStateException: Hit error while closing operators - failing tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from:
> hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_task_tmp.-ext-10002/_tmp.002148_0
> to:
> hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_tmp.-ext-10002/002148_0
> at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202)
> at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
> at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106)
> at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
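Editor's note: the uniqueness problem quoted above (a task output name derived only from the partition id collides when Spark retries the task) can be illustrated with a minimal, self-contained Java sketch. The method name and the id format below are illustrative assumptions, not Hive's actual implementation.

```java
// Sketch of why a task file name like "002148_0" must include the
// attempt number, not just the partition id. If attempt 0 finishes
// its rename but the executor dies before reporting success, Spark
// launches attempt 1; with a fixed "_0" suffix the retry would try
// to rename onto the file the first attempt already created.
public class AttemptIdSketch {
    // Hypothetical helper: build the tmp-file name from partition id
    // and attempt number (Hive's real naming logic may differ).
    static String tmpName(int partitionId, int attemptNumber) {
        return String.format("%06d_%d", partitionId, attemptNumber);
    }

    public static void main(String[] args) {
        // Attempt 0 of partition 2148 completes its rename, then the
        // executor connection times out; Spark schedules a retry.
        String first = tmpName(2148, 0); // "002148_0"
        String retry = tmpName(2148, 1); // "002148_1"
        // Because the attempt number is part of the name, the retry's
        // rename target does not clash with the first attempt's file.
        System.out.println(first + " vs " + retry);
    }
}
```

In a real Spark task, the per-attempt component is available from `TaskContext` (e.g. the attempt number alongside the partition id), which is what makes the generated name unique across retries.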