[ https://issues.apache.org/jira/browse/HADOOP-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756540#action_12756540 ]
dhruba borthakur commented on HADOOP-6240:
------------------------------------------

Many of the applications I have built depend on atomic renames of files in HDFS as well as in other filesystems. This is a primitive that a great many applications depend upon. In the future we can state that rename of a symlink is atomic, but given that symlinks are not there yet, what primitive can I use in my application now to ensure atomicity?

For example, suppose one renames /file1 to /file2, and file2 already existed before the rename. In the absence of atomicity, the call can result in the following scenarios:

1. file1 is completely lost. file2 remains the same as it was before the rename.
2. file2 is deleted but file1 remains as it is.
3. file1 remains as it is. file2's content is replaced by the contents of file1.

All of the above scenarios are bad, especially the first one. An application has to implement plenty of tricky logic to recover from them. Maybe we can write this tricky code inside HDFS (once, rather than every application doing it by itself), even if the namenode is distributed. If we do not make rename atomic, it feels like we are punting on a hard problem that many applications will then have to solve by themselves.

If atomic rename is a performance concern in the distributed namenode scenario, we can introduce a parameter to the rename call that allows applications that do not need atomic renames to avoid the performance penalty.

> Rename operation is not consistent between different implementations of FileSystem
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-6240
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6240
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: 0.21.0
>
>
> The rename operation has many scenarios that are not consistently implemented across file systems.
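
To make the concern in the comment concrete, here is a minimal sketch in Python against a local POSIX filesystem. It is not the HDFS or Hadoop FileSystem API; the helper names (make_files, replace_atomically, replace_in_two_steps) and the fail_between flag are hypothetical and exist only for illustration. It contrasts a single-step rename over an existing target with a delete-then-rename sequence that is interrupted halfway, which reproduces scenario 2 above (file2 deleted, file1 left in place).

```python
import os
import tempfile


def make_files(dirpath):
    """Create file1 (new contents) and file2 (old contents) under dirpath."""
    file1 = os.path.join(dirpath, "file1")
    file2 = os.path.join(dirpath, "file2")
    with open(file1, "w") as f:
        f.write("new")
    with open(file2, "w") as f:
        f.write("old")
    return file1, file2


def replace_atomically(src, dst):
    """Rename src over dst in one step.

    On POSIX, os.replace maps to rename(2), so an observer sees either the
    old dst or the new one, never a missing dst.
    """
    os.replace(src, dst)


def replace_in_two_steps(src, dst, fail_between=False):
    """Non-atomic variant: delete dst first, then rename src onto it.

    fail_between is a hypothetical switch that simulates a crash after the
    delete but before the rename.
    """
    if os.path.exists(dst):
        os.remove(dst)
    if fail_between:
        raise RuntimeError("simulated crash between delete and rename")
    os.rename(src, dst)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        f1, f2 = make_files(d)
        replace_atomically(f1, f2)
        print("atomic:", os.path.exists(f1), open(f2).read())

    with tempfile.TemporaryDirectory() as d:
        f1, f2 = make_files(d)
        try:
            replace_in_two_steps(f1, f2, fail_between=True)
        except RuntimeError:
            pass
        # file2 is gone and file1 is still present: the caller must recover.
        print("two-step:", os.path.exists(f1), os.path.exists(f2))
```

With the two-step variant, every caller needs its own recovery logic for the interrupted case, which is the per-application burden the comment argues should be handled once inside the filesystem.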
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.