[ https://issues.apache.org/jira/browse/HADOOP-15086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276796#comment-16276796 ]
Steve Loughran commented on HADOOP-15086:
-----------------------------------------

I don't disagree with you about the existence of the problem; I just don't think it's easily fixed.

Essentially: blobstores tend not to have a rename() (or indeed create(overwrite=false) or delete(directory)), and the things we do to mimic these in our connectors aren't atomic.

1. We cover this in [Object Stores|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/introduction.html#Object_Stores_vs._Filesystems]
2. This is also common to: S3x, Swift, OSS, ADL, ...
3. By inference, the Hadoop FileOutputCommit protocol is not atomic on object stores either.
4. Compare with the requirements of rename() as covered in [rename()|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html#boolean_renamePath_src_Path_d]

There is actually special support in Azure for atomic rename of HBase directories; this is done with leasing, recovery and stuff. It manages exclusivity, but it is still not an O(1) operation.

If you look at where we are going with this, the work is in moving to object-store-specific committers which provide the commit semantics without relying on renames. HADOOP-13786 is the initial implementation of this for S3A, but the hooks put into FileOutputFormat are designed to support filesystem-specific committers for any store which implements one.

I'm closing as a WONTFIX. Sorry. It's not that we don't want to, it's just that directory operations are where the metaphor "object stores are like filesystems" fails if you look closely enough.
(On a brighter note: wasb is consistent for both metadata and data.)

> NativeAzureFileSystem.rename is not atomic
> ------------------------------------------
>
> Key: HADOOP-15086
> URL: https://issues.apache.org/jira/browse/HADOOP-15086
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 2.7.3
> Reporter: Shixiong Zhu
> Attachments: RenameReproducer.java
>
>
> When multiple threads rename files to the same target path, more than 1
> threads can succeed. It's because check and copy file in `rename` is not
> atomic.
> I would expect it's atomic just like HDFS.
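
For readers who want to see the shape of the race in the issue description, here is a minimal sketch. It assumes a toy in-memory map standing in for a blob store; it is not the NativeAzureFileSystem code or the Azure API, and the class and method names (`RenameRaceSketch`, `naiveRename`, `atomicRename`, `countSuccesses`) are hypothetical. The latch exists only to force the bad interleaving deterministically: both renamers pass the existence check before either one copies.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy in-memory "blob store"; an illustration only, not the Hadoop or Azure API. */
public class RenameRaceSketch {

    /** Check-then-copy rename: the existence check and the write are separate steps. */
    static boolean naiveRename(ConcurrentMap<String, String> store, String src, String dst,
                               CountDownLatch allChecked) {
        if (store.containsKey(dst)) {
            return false; // destination already exists
        }
        // Force the bad interleaving: wait until every renamer has passed the check.
        allChecked.countDown();
        await(allChecked);
        store.put(dst, store.get(src)); // "copy"; the last writer silently wins
        store.remove(src);              // "delete source"
        return true;
    }

    /** Rename built on an atomic put-if-absent primitive, which many stores lack. */
    static boolean atomicRename(ConcurrentMap<String, String> store, String src, String dst) {
        if (store.putIfAbsent(dst, store.get(src)) != null) {
            return false; // another renamer got there first
        }
        store.remove(src);
        return true;
    }

    /** Runs two renamers against the same destination and counts how many report success. */
    public static int countSuccesses(boolean atomic) {
        ConcurrentMap<String, String> store = new ConcurrentHashMap<>();
        store.put("src-0", "data-0");
        store.put("src-1", "data-1");
        CountDownLatch allChecked = new CountDownLatch(2);
        AtomicInteger successes = new AtomicInteger();
        Thread[] renamers = new Thread[2];
        for (int i = 0; i < renamers.length; i++) {
            String src = "src-" + i;
            renamers[i] = new Thread(() -> {
                boolean ok = atomic
                        ? atomicRename(store, src, "dst")
                        : naiveRename(store, src, "dst", allChecked);
                if (ok) {
                    successes.incrementAndGet();
                }
            });
            renamers[i].start();
        }
        for (Thread t : renamers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return successes.get();
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("naive rename successes: " + countSuccesses(false));
        System.out.println("atomic rename successes: " + countSuccesses(true));
    }
}
```

In the naive variant both renamers report success and one source's data is overwritten; in the put-if-absent variant exactly one succeeds. The point of the comment above is that a store without such a primitive cannot get the second behaviour from the client side alone.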