[ https://issues.apache.org/jira/browse/HIVE-24322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aman Raj updated HIVE-24322:
----------------------------
    Fix Version/s:     (was: 3.2.0)

> In case of direct insert, the attempt ID has to be checked when reading the manifest files
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-24322
>                 URL: https://issues.apache.org/jira/browse/HIVE-24322
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In IMPALA-10247 there was an exception from Hive when trying to load the data:
> {noformat}
> 2020-10-13T16:50:53,424 ERROR [HiveServer2-Background-Pool: Thread-23832] exec.Task: Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.EOFException)'
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
> 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1468)
> 	at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798)
> 	at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
> 	at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:627)
> 	at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:342)
> 	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> 	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> 	at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
> 	at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
> 	at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> 	at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
> 	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488)
> 	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482)
> 	at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> 	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
> 	at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
> 	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
> 	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.EOFException
> 	at java.io.DataInputStream.readInt(DataInputStream.java:392)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.handleDirectInsertTableFinalPath(Utilities.java:4587)
> 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1462)
> 	... 29 more
> {noformat}
> The reason for the exception was that Hive was trying to read an empty manifest file. Manifest files are used in case of direct insert to determine which files need to be kept and which ones need to be cleaned up. They are created by the tasks, and they use the task attempt ID as a postfix.
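The `EOFException` at `DataInputStream.readInt` in the trace above is exactly what reading an empty file through a `DataInputStream` produces: `readInt()` needs four bytes, and an empty stream has none. A minimal demonstration of the mechanism (the class and method names here are hypothetical, chosen only for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class EmptyManifestDemo {
    // Simulates reading an empty manifest file: readInt() on a stream with
    // no bytes throws EOFException, the same failure mode as in the trace.
    public static boolean readFromEmptyThrowsEof() {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(new byte[0]))) {
            in.readInt(); // needs 4 bytes, stream is empty
            return false;
        } catch (EOFException e) {
            return true;  // expected: empty input cannot supply an int
        } catch (IOException e) {
            return false;
        }
    }
}
```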
> In this particular test, what happened is that one of the containers ran out of memory, so Tez killed it right after the manifest file got created but before the paths were written into it. This was the manifest file for task attempt 0. Tez then assigned a new container to the task, so a new attempt was made with attemptId=1. This attempt was successful and wrote its manifest file correctly. But Hive didn't know about any of this: the out-of-memory issue was handled by Tez under the hood, so there was no exception in Hive and therefore no clean-up in the manifest folder. And when Hive reads the manifest files, it simply reads every file in the defined folder, so it tried to read the manifest files for both attempt 0 and attempt 1. If there are multiple manifest files with the same name but different attempt IDs, Hive should only read the one with the highest attempt ID.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
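The selection rule described above (when several manifest files share a name but carry different attempt IDs, keep only the highest attempt ID) can be sketched roughly as follows. This is not Hive's actual code: the `_<attemptId>` suffix parsing and the class/method names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ManifestSelector {
    // Hypothetical helper: given manifest file names of the assumed form
    // <base>_<attemptId>, return one name per base -- the one with the
    // highest attempt ID. Earlier attempts (possibly empty, left behind by
    // killed containers) are dropped instead of being read.
    public static List<String> keepLatestAttempts(List<String> manifests) {
        Map<String, Integer> bestAttempt = new HashMap<>();
        Map<String, String> bestName = new HashMap<>();
        for (String name : manifests) {
            int idx = name.lastIndexOf('_');
            String base = name.substring(0, idx);
            int attempt = Integer.parseInt(name.substring(idx + 1));
            Integer current = bestAttempt.get(base);
            if (current == null || attempt > current) {
                bestAttempt.put(base, attempt);
                bestName.put(base, name);
            }
        }
        return new ArrayList<>(bestName.values());
    }
}
```

With this filter, the scenario above (an empty `..._0` manifest left behind by the killed container plus a complete `..._1` from the retry) would yield only the attempt-1 file to read.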