[ https://issues.apache.org/jira/browse/FLINK-23725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403526#comment-17403526 ]
todd commented on FLINK-23725:
------------------------------

I think that would be a problem if two files are written by different executions without consistent state (e.g. starting a new job without a savepoint while keeping the files of previous executions).

[~Paul Lin] You're right. However, if we only throw the exception, the user is still left to diagnose and resolve the problem manually, which is troublesome.

> HadoopFsCommitter, file rename failure
> --------------------------------------
>
> Key: FLINK-23725
> URL: https://issues.apache.org/jira/browse/FLINK-23725
> Project: Flink
> Issue Type: Bug
> Components: Connectors / FileSystem, Connectors / Hadoop Compatibility, FileSystems
> Affects Versions: 1.11.1, 1.12.1
> Reporter: todd
> Priority: Major
>
> When an HDFS file is committed, and the target part file already exists, the rename only returns false and the failure is silently ignored. The committer should either throw an exception indicating that the part file already exists, or at least log the failure.
>
> ```
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.HadoopFsCommitter#commit
>
> public void commit() throws IOException {
>     final Path src = recoverable.tempFile();
>     final Path dest = recoverable.targetFile();
>     final long expectedLength = recoverable.offset();
>     try {
>         // always returns false or true; a false return is ignored
>         fs.rename(src, dest);
>     } catch (IOException e) {
>         throw new IOException(
>             "Committing file by rename failed: " + src + " to " + dest, e);
>     }
> }
> ```

-- This message was sent by Atlassian Jira (v8.3.4#803005)
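The core of the report is that Hadoop's `FileSystem.rename` signals most failures (including an already-existing target, on HDFS) by returning `false` rather than throwing, so the committer can silently "succeed" without moving the file. A minimal sketch of the suggested fix is below; it uses local `java.io.File.renameTo` (which has the same boolean-return behavior) instead of a Hadoop `FileSystem` so it stays self-contained, and the class and method names are hypothetical, not the actual Flink patch:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class CommitSketch {

    // Hypothetical variant of HadoopFsCommitter#commit that surfaces a
    // false return from rename instead of ignoring it. File.renameTo,
    // like Hadoop's FileSystem.rename, reports most failures by
    // returning false rather than throwing.
    public static void commit(File src, File dest) throws IOException {
        boolean renamed = src.renameTo(dest);
        if (!renamed) {
            if (dest.exists()) {
                // The "part file already exists" case from the issue. Note: on
                // HDFS a rename onto an existing target returns false, while a
                // local POSIX rename may silently replace it, so this branch
                // triggers reliably only on HDFS-like semantics.
                throw new IOException(
                        "Committing file by rename failed: target already exists: " + dest);
            }
            throw new IOException(
                    "Committing file by rename failed: " + src + " to " + dest);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("commit-sketch").toFile();
        File src = new File(dir, "part-0.inprogress");
        File dest = new File(dir, "part-0");
        Files.write(src.toPath(), "data".getBytes());

        // Succeeds: the in-progress file becomes the committed part file.
        commit(src, dest);
        System.out.println("committed: " + dest.exists());

        try {
            // The source is gone after the first commit, so rename returns
            // false and the failure is now reported instead of swallowed.
            commit(src, new File(dir, "part-1"));
        } catch (IOException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only the `if (!renamed)` check: whether the right reaction is throwing (as here) or logging is exactly what this thread is debating.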