[ https://issues.apache.org/jira/browse/FLINK-23725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403698#comment-17403698 ]
todd commented on FLINK-23725:
------------------------------

[~sewen] The target part file already exists. If you restore from an old checkpoint (ck), the old part file will be used, because the rename operation only returns true or false and does not overwrite the existing file's content. If the task is restored from an earlier state snapshot, it is difficult to ensure data consistency.

For example, the current part-0-10 file was generated by ck-10, and my task is restored from ck-0-5. The in-progress file generated by the new task will not be renamed to the final part file, because that file already exists in the current directory. The part files that are kept are the ones written earlier.

> HadoopFsCommitter, file rename failure
> --------------------------------------
>
>                 Key: FLINK-23725
>                 URL: https://issues.apache.org/jira/browse/FLINK-23725
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem, Connectors / Hadoop Compatibility, FileSystems
>    Affects Versions: 1.11.1, 1.12.1
>            Reporter: todd
>            Priority: Major
>
> When an HDFS file is written and the part file already exists, the rename fails and only returns false. Should an exception be thrown stating that the part file already exists, or should a related log message be printed?
>
> ```
> // org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.HadoopFsCommitter#commit
> public void commit() throws IOException {
>     final Path src = recoverable.tempFile();
>     final Path dest = recoverable.targetFile();
>     final long expectedLength = recoverable.offset();
>     try {
>         // rename only returns true or false; the result is ignored here
>         fs.rename(src, dest);
>     } catch (IOException e) {
>         throw new IOException(
>                 "Committing file by rename failed: " + src + " to " + dest, e);
>     }
> }
> ```

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
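
To make the request above concrete, here is a minimal sketch of what surfacing a false rename result could look like. It is not Flink's code or a proposed patch: `RenameCheckSketch`, `RenameFs`, and `InMemoryFs` are hypothetical stand-ins, the file names are made up, and the toy file system only assumes the behavior described in this issue (rename returns false and leaves the existing target untouched).

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class RenameCheckSketch {

    /** Hypothetical stand-in for a file system whose rename reports failure via its return value. */
    interface RenameFs {
        boolean rename(String src, String dest) throws IOException;
    }

    /** Toy in-memory file system: rename returns false if src is missing or dest already exists. */
    static class InMemoryFs implements RenameFs {
        final Set<String> files = new HashSet<>();

        @Override
        public boolean rename(String src, String dest) {
            if (!files.contains(src) || files.contains(dest)) {
                return false; // nothing is moved or overwritten
            }
            files.remove(src);
            files.add(dest);
            return true;
        }
    }

    /** Commits by rename and turns a false result into an exception instead of ignoring it. */
    static void commit(RenameFs fs, String src, String dest) throws IOException {
        final boolean renamed;
        try {
            renamed = fs.rename(src, dest);
        } catch (IOException e) {
            throw new IOException(
                    "Committing file by rename failed: " + src + " to " + dest, e);
        }
        if (!renamed) {
            throw new IOException(
                    "Committing file by rename failed: " + src + " to " + dest
                            + " (rename returned false; the target may already exist)");
        }
    }

    public static void main(String[] args) {
        InMemoryFs fs = new InMemoryFs();
        fs.files.add("part-0-10");            // final file left by the earlier run
        fs.files.add(".part-0-10.inprogress"); // file written after restoring
        try {
            commit(fs, ".part-0-10.inprogress", "part-0-10");
        } catch (IOException e) {
            System.out.println("commit failed: " + e.getMessage());
        }
    }
}
```

Whether throwing is the right reaction, as opposed to logging or comparing the existing file against the expected length, is exactly the open question in this issue; the sketch only shows that the boolean result is available to act on.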