[ https://issues.apache.org/jira/browse/HIVE-28335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HIVE-28335:
--------------------------------
    Description:
This is the follow-up to HIVE-27884, where I finally decided to leave deleteOnExit as is because I didn't need to change it.
In the scope of this ticket, we need to check and remove all deleteOnExit calls that belong to Hadoop FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit calls):
{code}
grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
...
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: // in recent hadoop versions, use deleteOnExit to clean tmp files.
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java: fileSystem.deleteOnExit(dir);
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: parentDir.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java: parentDir.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java: tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java: tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java: autoDelete = fs.deleteOnExit(outPath);
{code}
As a reference from the previous ticket: [commit|https://github.com/abstractdog/hive/commit/7a9d299f6994ca5a8c17486549103b25692b5cba]
That commit caused some HDFS counter differences in q.outs, which still need to be investigated.

  was:
This is the follow-up to HIVE-27884, where I finally decided to leave deleteOnExit as is because I didn't need to change it.
In the scope of this ticket, we need to check and remove all deleteOnExit calls that belong to Hadoop FileSystem objects (this doesn't necessarily apply to java.io.File.deleteOnExit calls):
{code}
grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
...
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: // in recent hadoop versions, use deleteOnExit to clean tmp files.
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java: fileSystem.deleteOnExit(dir);
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: parentDir.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java: parentDir.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java: tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java: tmpFile.deleteOnExit();
ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java: autoDelete = fs.deleteOnExit(outPath);
{code}
As a reference from the previous ticket: [commit|https://github.com/apache/hive/pull/4882/commits/7a9d299f6994ca5a8c17486549103b25692b5cba]
That commit caused some HDFS counter differences in q.outs, which still need to be investigated.


> Review deleteOnExitUsage
> ------------------------
>
>                 Key: HIVE-28335
>                 URL: https://issues.apache.org/jira/browse/HIVE-28335
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Priority: Major
>
> This is the follow-up to HIVE-27884, where I finally decided to leave
> deleteOnExit as is because I didn't need to change it.
> In the scope of this ticket, we need to check and remove all deleteOnExit
> calls that belong to Hadoop FileSystem objects (this doesn't necessarily
> apply to java.io.File.deleteOnExit calls):
> {code}
> grep -iRH "deleteOnExit" --include="*.java" | grep -v "test"
> ...
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> // in recent hadoop versions, use deleteOnExit to clean tmp files.
> ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:
> autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);
> ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java:
> fileSystem.deleteOnExit(dir);
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> parentDir.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/KeyValueContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/ObjectContainer.java:
> tmpFile.deleteOnExit();
> ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java:
> autoDelete = fs.deleteOnExit(outPath);
> {code}
> As a reference from the previous ticket:
> [commit|https://github.com/abstractdog/hive/commit/7a9d299f6994ca5a8c17486549103b25692b5cba]
> That commit caused some HDFS counter differences in q.outs, which still
> need to be investigated.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
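
To separate the Hadoop FileSystem hits from the java.io.File ones in the grep output above, one possible first pass is to look at the call shape: Hadoop's FileSystem.deleteOnExit takes a Path argument, while java.io.File.deleteOnExit takes none. The sketch below is only a heuristic under that assumption; the class name DeleteOnExitTriage, the Kind enum and the classify method are hypothetical helpers and not part of Hive or Hadoop, and every hit should still be confirmed against the receiver's declared type.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical triage helper, not part of Hive: classifies one grep hit by call shape.
class DeleteOnExitTriage {
  enum Kind { HADOOP_FILESYSTEM, JAVA_IO_FILE, NOT_A_CALL }

  // Captures the argument text between "deleteOnExit(" and the first ")".
  private static final Pattern CALL = Pattern.compile("\\.deleteOnExit\\(([^)]*)\\)");

  static Kind classify(String grepLine) {
    Matcher m = CALL.matcher(grepLine);
    if (!m.find()) {
      return Kind.NOT_A_CALL; // e.g. a comment that only mentions deleteOnExit
    }
    // Assumption: a no-arg call is java.io.File, a call with an argument is FileSystem.
    return m.group(1).trim().isEmpty() ? Kind.JAVA_IO_FILE : Kind.HADOOP_FILESYSTEM;
  }

  public static void main(String[] args) {
    // A few of the hits listed in the ticket.
    List<String> hits = List.of(
        "ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: autoDelete = fs.deleteOnExit(fsp.outPaths[filesIdx]);",
        "ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/PathInfo.java: fileSystem.deleteOnExit(dir);",
        "ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/RowContainer.java: tmpFile.deleteOnExit();");
    for (String hit : hits) {
      System.out.println(classify(hit) + "  " + hit);
    }
  }
}
```

With this heuristic, the FileSinkOperator, PathInfo and AbstractFileMergeOperator calls would be the candidates in scope, while the RowContainer, KeyValueContainer and ObjectContainer calls look like java.io.File usages.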