[ https://issues.apache.org/jira/browse/HDFS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669000#comment-13669000 ]

Konstantin Shvachko commented on HDFS-4849:
-------------------------------------------

Steve, that is a good example. Rename is the trickiest operation in file systems, and concurrency logic is always an issue in distributed storage.

Suppose that in current HDFS client1 renames (moves) file A to B when B already exists, which should replace B with the contents of A. Suppose that at the same time client2 deletes file B. Since there is no guarantee which operation is executed first, you can end up either with A renamed to B, if the delete goes first, or with no files at all, if the rename goes first and is followed by the deletion of B.

This is similar to your case. When client1 retries, from its perspective the delete was not completed, so it deletes again. That is no different from the case when client1 is slow and executes its delete after the rename, or when there are other clients besides 1 and 2 doing something with /path.

My point is that if you need to coordinate clients, you should do it with an external tool, like ZooKeeper (ZK).

> Idempotent create, append and delete operations.
> ------------------------------------------------
>
>                 Key: HDFS-4849
>                 URL: https://issues.apache.org/jira/browse/HDFS-4849
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.4-alpha
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>
> create, append and delete operations can be made idempotent. This will reduce
> chances for a job or other app failures when NN fails over.
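
To make the orderings in the comment concrete, here is a minimal sketch that replays them against a toy in-memory namespace. It is not the HDFS API: the `Namespace` class, the paths `/path/A` and `/path/B`, and the file contents are all hypothetical, and the only assumption carried over from the comment is that rename overwrites an existing destination. Each operation is atomic on its own, yet the final state still depends on which one runs first.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the rename/delete race; an illustration, not HDFS code. */
public class RenameDeleteRace {

    /** Hypothetical namespace: path -> contents. Every method is atomic. */
    static final class Namespace {
        private final Map<String, String> files = new TreeMap<>();

        synchronized void create(String path, String contents) {
            files.put(path, contents);
        }

        /** Rename with overwrite: dst ends up holding the contents of src. */
        synchronized boolean rename(String src, String dst) {
            String contents = files.remove(src);
            if (contents == null) {
                return false; // source does not exist
            }
            files.put(dst, contents);
            return true;
        }

        /** Returns true if something was actually deleted. */
        synchronized boolean delete(String path) {
            return files.remove(path) != null;
        }

        synchronized Map<String, String> snapshot() {
            return new TreeMap<>(files);
        }
    }

    /** Initial state: both A and B exist. */
    static Namespace fresh() {
        Namespace ns = new Namespace();
        ns.create("/path/A", "a-data");
        ns.create("/path/B", "b-data");
        return ns;
    }

    /** client2's delete wins the race, then client1 renames A to B. */
    public static Map<String, String> deleteThenRename() {
        Namespace ns = fresh();
        ns.delete("/path/B");
        ns.rename("/path/A", "/path/B");
        return ns.snapshot();
    }

    /** client1's rename wins the race, then client2 deletes B. */
    public static Map<String, String> renameThenDelete() {
        Namespace ns = fresh();
        ns.rename("/path/A", "/path/B");
        ns.delete("/path/B");
        return ns.snapshot();
    }

    /** A delete whose ack was lost is retried after another client's rename. */
    public static Map<String, String> retriedDeleteAfterRename() {
        Namespace ns = fresh();
        ns.delete("/path/B");            // succeeds, but the client never sees the reply
        ns.rename("/path/A", "/path/B"); // another client moves A into place
        ns.delete("/path/B");            // the retry removes the renamed file
        return ns.snapshot();
    }

    public static void main(String[] args) {
        System.out.println("delete first:  " + deleteThenRename());
        System.out.println("rename first:  " + renameThenDelete());
        System.out.println("retried delete: " + retriedDeleteAfterRename());
    }
}
```

In this model the retried delete leaves the same state as a slow client that issues its delete after the rename, so the namespace alone cannot tell the two apart. Ordering the clients would need coordination outside the file system.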