[ https://issues.apache.org/jira/browse/HIVE-15054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608359#comment-15608359 ]
Aihua Xu edited comment on HIVE-15054 at 10/26/16 12:50 PM:
------------------------------------------------------------

[~lirui] Thanks for taking a look. It would be hard to repro; it depends on which state the first executor is in when it is aborted or dies. You will see this issue when the task has finished writing the data to a tmp file and is renaming it to the final tmp file, while at that moment Spark kills the task (as in your case) or the executor loses its connection. The case I have seen is: the connection to the executor times out, but the executor is almost done with its work (the result has finished writing and been renamed to the final tmp file, and the only thing left is to report to the driver that the task is done). If the rename doesn't happen, you won't see this issue.

was (Author: aihuaxu):
[~lirui] Thanks for taking a look. It would be hard to repro; it depends on which state the first executor is in when it is aborted or dies. You will see this issue when the task has finished writing the data to a tmp file and is renaming it to the final tmp file, but Spark kills the task (as in your case) or the executor loses the connection at that time. The case I have seen is: the connection to the executor times out, but the executor is almost done with its work (the result has finished writing and been renamed to the final tmp file, and the only thing left is to report to the driver that the task is done). If the rename doesn't happen, you won't see this issue.

> Hive insertion query execution fails on Hive on Spark
> -----------------------------------------------------
>
>                 Key: HIVE-15054
>                 URL: https://issues.apache.org/jira/browse/HIVE-15054
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 2.0.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-15054.1.patch
>
>
> The query of {{insert overwrite table tbl1}} sometimes will fail with the
> following errors.
> Seems we are constructing taskAttemptId with partitionId,
> which is not unique if there are multiple attempts.
> {noformat}
> java.lang.IllegalStateException: Hit error while closing operators - failing tree: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from:
> hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_task_tmp.-ext-10002/_tmp.002148_0
> to:
> hdfs://table1/.hive-staging_hive_2016-06-14_01-53-17_386_3231646810118049146-9/_tmp.-ext-10002/002148_0
> at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:202)
> at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
> at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:106)
> at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$15.apply(AsyncRDDActions.scala:120)
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
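Editor's note: the uniqueness problem quoted above (a task output name derived only from the partition id collides when Spark retries the task) can be illustrated with a minimal, self-contained Java sketch. The method name and the id format below are illustrative assumptions, not Hive's actual implementation.

```java
// Sketch of why a task file name like "002148_0" must include the
// attempt number, not just the partition id. If attempt 0 finishes
// its rename but the executor dies before reporting success, Spark
// launches attempt 1; with a fixed "_0" suffix the retry would try
// to rename onto the file the first attempt already created.
public class AttemptIdSketch {
    // Hypothetical helper: build the tmp-file name from partition id
    // and attempt number (Hive's real naming logic may differ).
    static String tmpName(int partitionId, int attemptNumber) {
        return String.format("%06d_%d", partitionId, attemptNumber);
    }

    public static void main(String[] args) {
        // Attempt 0 of partition 2148 completes its rename, then the
        // executor connection times out; Spark schedules a retry.
        String first = tmpName(2148, 0); // "002148_0"
        String retry = tmpName(2148, 1); // "002148_1"
        // Because the attempt number is part of the name, the retry's
        // rename target does not clash with the first attempt's file.
        System.out.println(first + " vs " + retry);
    }
}
```

In a real Spark task, the per-attempt component is available from `TaskContext` (e.g. the attempt number alongside the partition id), which is what makes the generated name unique across retries.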