[ 
https://issues.apache.org/jira/browse/FLINK-23725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403698#comment-17403698
 ] 

todd commented on FLINK-23725:
------------------------------

[~sewen]       

The target part file already exists. If you restore from an old checkpoint, the
old part file is kept, because the rename operation only returns true or false
and does not overwrite the existing file's content. If the task is restored
from an earlier state snapshot, it is therefore difficult to guarantee data
consistency.

For example, the current part-0-10 file was generated by ck-10, and my task is
restored from ck-0-5. The in-progress file generated by the new task will not
be renamed to the final part file, because that file already exists in the
current directory. The part file that was written earlier is kept as is.
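
To make the scenario concrete, here is a minimal sketch. It does not use the
Hadoop API: InMemoryFs is a hypothetical in-memory stand-in whose rename()
is assumed to behave as described above (it returns false and leaves the
destination untouched when the destination exists). commitIgnoringResult and
commitChecked are illustrative names too, not Flink methods.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class RenameCommitSketch {

    /** Hypothetical stand-in for a file system with the rename semantics assumed above. */
    static final class InMemoryFs {
        private final Map<String, String> files = new HashMap<>();

        void write(String path, String content) { files.put(path, content); }

        String read(String path) { return files.get(path); }

        boolean exists(String path) { return files.containsKey(path); }

        /** Returns false (no exception, no overwrite) if src is missing or dst exists. */
        boolean rename(String src, String dst) {
            if (!files.containsKey(src) || files.containsKey(dst)) {
                return false;
            }
            files.put(dst, files.remove(src));
            return true;
        }
    }

    /** Mirrors the commit() quoted in the issue: the boolean result is dropped. */
    static void commitIgnoringResult(InMemoryFs fs, String src, String dst) {
        fs.rename(src, dst);
    }

    /** One possible alternative: turn a failed rename into an exception. */
    static void commitChecked(InMemoryFs fs, String src, String dst) throws IOException {
        if (!fs.rename(src, dst)) {
            throw new IOException(
                "Committing file by rename failed: " + src + " to " + dst
                    + (fs.exists(dst) ? " (target already exists)" : ""));
        }
    }

    public static void main(String[] args) {
        InMemoryFs fs = new InMemoryFs();
        fs.write("part-0-10", "data from ck-10");
        fs.write(".part-0-10.inprogress", "data after restore");

        // The rename fails silently; the old part file content is still there.
        commitIgnoringResult(fs, ".part-0-10.inprogress", "part-0-10");
        System.out.println(fs.read("part-0-10"));

        // The checked variant reports the failure instead.
        try {
            commitChecked(fs, ".part-0-10.inprogress", "part-0-10");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether a failed rename should throw, log, or overwrite is a design decision
for the committer. The sketch only shows that ignoring the return value hides
the failure.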

> HadoopFsCommitter, file rename failure
> --------------------------------------
>
>                 Key: FLINK-23725
>                 URL: https://issues.apache.org/jira/browse/FLINK-23725
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem, Connectors / Hadoop 
> Compatibility, FileSystems
>    Affects Versions: 1.11.1, 1.12.1
>            Reporter: todd
>            Priority: Major
>
> When an HDFS file is written and the target part file already exists, the 
> rename fails and only returns false. Should it throw an exception saying that 
> the part file already exists, or at least log the failure?
>  
> ```
> // org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.HadoopFsCommitter#commit
> public void commit() throws IOException {
>     final Path src = recoverable.tempFile();
>     final Path dest = recoverable.targetFile();
>     final long expectedLength = recoverable.offset();
>     try {
>         // rename always returns true or false
>         fs.rename(src, dest);
>     } catch (IOException e) {
>         throw new IOException(
>             "Committing file by rename failed: " + src + " to " + dest, e);
>     }
> }
> ```



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
