[ https://issues.apache.org/jira/browse/HIVE-26933?focusedWorklogId=841281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-841281 ]
ASF GitHub Bot logged work on HIVE-26933:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Jan/23 05:13
            Start Date: 24/Jan/23 05:13
    Worklog Time Spent: 10m
      Work Description: harshal-16 opened a new pull request, #3979:
URL: https://github.com/apache/hive/pull/3979

   Problem:
   - If an incremental dump operation fails while dumping an event id to the staging directory, the dump directory for that event id, along with its _dumpmetadata file, is left behind in the dump location, which is recorded in the _events_dump file.
   - When the user triggers the dump operation for this policy again, it resumes from the failed event id and tries to dump it again, but because the directory for that event id was already created in the previous cycle, the dump fails with an exception.

   Solution:
   - Fixed cleanFailedEventDirIfExists to remove the directory of the failed event id for the selected database.

Issue Time Tracking
-------------------

            Worklog Id:     (was: 841281)
            Time Spent: 2h  (was: 1h 50m)

> Cleanup dump directory for eventId which failed in the previous dump cycle
> --------------------------------------------------------------------------
>
>                 Key: HIVE-26933
>                 URL: https://issues.apache.org/jira/browse/HIVE-26933
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Harshal Patel
>            Assignee: Harshal Patel
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> # If an incremental dump operation fails while dumping an event id to the
> staging directory, the dump directory for that event id, along with its
> _dumpmetadata file, is left behind in the dump location, which is recorded
> in the _events_dump file.
> # When the user triggers the dump operation for this policy again, it
> resumes from the failed event id and tries to dump it again, but because the
> directory for that event id was already created in the previous cycle, the
> dump fails with the exception:
> {noformat}
> [Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]:
> FAILED: Execution Error, return code 40000 from
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.
> org.apache.hadoop.fs.FileAlreadyExistsException:
> /warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata
> for client 172.27.182.5 already exists
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:388)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2576)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2473)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:773)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:490)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
> {noformat}
>


--
This message was sent by Atlassian Jira
(v8.20.10#820010)
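
For illustration only, a minimal sketch of the cleanup the fix describes: before re-dumping a previously failed event id, delete that event id's leftover directory so the retry can recreate it (and its _dumpmetadata file) without a FileAlreadyExistsException. Hive's actual cleanFailedEventDirIfExists operates on the HDFS staging directory through Hadoop's FileSystem API; this standalone version substitutes java.nio.file on a local path, and the one-directory-per-event-id layout is an assumed simplification, not the real staging layout.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CleanFailedEventDir {

    // Recursively delete the dump sub-directory of a failed event id, if it
    // exists, so a re-triggered dump can recreate it from scratch.
    // Returns true if a leftover directory was found and removed.
    // (Hypothetical local stand-in for Hive's cleanFailedEventDirIfExists.)
    static boolean cleanFailedEventDirIfExists(Path dumpRoot, long failedEventId)
            throws IOException {
        Path eventDir = dumpRoot.resolve(Long.toString(failedEventId));
        if (!Files.isDirectory(eventDir)) {
            return false;                       // nothing left over; nothing to do
        }
        try (Stream<Path> walk = Files.walk(eventDir)) {
            // Reverse sort so children are deleted before their parent dirs.
            List<Path> paths = walk.sorted(Comparator.reverseOrder())
                                   .collect(Collectors.toList());
            for (Path p : paths) {
                Files.delete(p);
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Simulate a partially written dump for event id 137961 (the id from
        // the stack trace above).
        Path root = Files.createTempDirectory("repl-dump");
        Path eventDir = Files.createDirectories(root.resolve("137961"));
        Files.createFile(eventDir.resolve("_dumpmetadata"));

        // On resume, clear the leftover directory before re-dumping the event.
        boolean removed = cleanFailedEventDirIfExists(root, 137961L);
        System.out.println("removed=" + removed + " exists=" + Files.exists(eventDir));
    }
}
```

The reverse-sorted walk is the standard recursive-delete idiom for java.nio.file; Hadoop's FileSystem offers the same effect directly via a recursive delete call.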