[ https://issues.apache.org/jira/browse/SPARK-21772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16133271#comment-16133271 ]
Xiao Li commented on SPARK-21772: --------------------------------- https://github.com/apache/spark/pull/18985 > HiveException unable to move results from srcf to destf in > InsertIntoHiveTable > ------------------------------------------------------------------------------- > > Key: SPARK-21772 > URL: https://issues.apache.org/jira/browse/SPARK-21772 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 2.1.1, 2.2.0 > Environment: JDK1.7 > CentOS 6.3 > Spark2.1 > Reporter: liupengcheng > Labels: sql > > Currently, when execute {code:java} create table as select {code} would > return Exception: > {code:java} > 2017-08-17,16:14:18,792 ERROR > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation: Error > executing query, currentState RUNNING, > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.spark.sql.hive.client.Shim_v0_12.loadTable(HiveShim.scala:346) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply$mcV$sp(HiveClientImpl.scala:770) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply(HiveClientImpl.scala:770) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadTable$1.apply(HiveClientImpl.scala:770) > at > org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:316) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:262) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:261) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:305) > at > org.apache.spark.sql.hive.client.HiveClientImpl.loadTable(HiveClientImpl.scala:769) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply$mcV$sp(HiveExternalCatalog.scala:765) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply(HiveExternalCatalog.scala:763) > at > org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadTable$1.apply(HiveExternalCatalog.scala:763) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:100) > at > org.apache.spark.sql.hive.HiveExternalCatalog.loadTable(HiveExternalCatalog.scala:763) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:323) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:170) > at > org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:347) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:120) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:120) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:141) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:138) > at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:119) > at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92) > at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) > at > org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand.run(CreateHiveTableAsSelectCommand.scala:92) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:120) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:120) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:141) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:138) > at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:119) > at > org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92) > at > org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92) > at org.apache.spark.sql.Dataset.<init>(Dataset.scala:187) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) > at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592) > at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:222) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:162) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:159) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1854) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:172) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move > results from > hdfs://c3prc-micloud/user/sql_prc/warehouse/finance.db/lmq_tmp_s_1/.hive-staging_hive_2017-08-17_16-14-03_391_4386766533807462399-1/-ext-10000 > to destination directory: > hdfs://c3prc-micloud/user/sql_prc/warehouse/finance.db/lmq_tmp_s_1 > at > org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2389) > at > org.apache.hadoop.hive.ql.metadata.Table.replaceFiles(Table.java:673) > at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1490) > ... 57 more > {code} > I think, it's due to the incorrect tmpLocation is set. for now, the default > tmpLocation is `<sqlwarehousedir>/<db>/<table>/.hive-staging...`, but when > results data is genereted and > InsertIntoHiveTable started loading table, it would call repalceFiles > function, this function would delete `<sqlwarehousedir>/<db>/<table>` > directory and then put in generated results. > However, the deletion will delete the generated results, as a result, the > replaceFiles would fail. > the HiveException is thrown. > the tmpLocation should be fixed. -- This message was sent by Atlassian JIRA (v6.4.14#64029) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org