[ https://issues.apache.org/jira/browse/HBASE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123603#comment-13123603 ]
Todd Lipcon commented on HBASE-4553:
------------------------------------

Atomic operations on filesystems are tricky... to do this "correctly" in the face of crashes, we need some process to either roll back or roll forward to recover from failures. Something like:

writer:
- create tableinfo.tmp
- delete tableinfo
- rename tableinfo.tmp to tableinfo

reader:
- try to read tableinfo
- on IOE (block missing, etc.), that means the file was deleted underneath us, so spin until the file open succeeds

If the writer crashes between the delete and the rename, we need someone else to come in and "finish" the operation. IMO we need some general-purpose way of allowing the master to keep an "intent log" in ZK for this kind of thing; then if the master fails over, it can complete the operation.

> The update of .tableinfo is not atomic; we remove then rename
> -------------------------------------------------------------
>
>         Key: HBASE-4553
>         URL: https://issues.apache.org/jira/browse/HBASE-4553
>     Project: HBase
>  Issue Type: Task
>    Reporter: stack
>
> This comes out of HBASE-4547. The rename in 0.20 hdfs fails if the file
> already exists. In 0.20+ it's better, but there are still 'some' issues if
> there is an existing reader when the file is renamed. This issue is about
> fixing this (though we depend on the fix first being in hdfs).
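The writer/reader/recovery steps described in the comment can be sketched as follows. This is a minimal illustration only, using Python and the local filesystem in place of the HDFS FileSystem API; the function names (write_tableinfo, read_tableinfo, recover_tableinfo) and the retry parameters are hypothetical, not anything from the HBase codebase.

```python
import os
import time
from pathlib import Path

def write_tableinfo(table_dir: Path, data: bytes) -> None:
    """Writer: create tableinfo.tmp, delete tableinfo, rename tmp into place.

    A crash between the delete and the rename leaves no tableinfo file;
    that window is what the roll-forward / intent-log idea is meant to cover.
    """
    tmp = table_dir / "tableinfo.tmp"
    final = table_dir / "tableinfo"
    tmp.write_bytes(data)       # create tableinfo.tmp
    if final.exists():
        final.unlink()          # delete tableinfo
    os.rename(tmp, final)       # rename tableinfo.tmp -> tableinfo

def read_tableinfo(table_dir: Path, retries: int = 50, delay: float = 0.1) -> bytes:
    """Reader: spin until the open succeeds, since the file may have been
    deleted underneath us mid-update (an IOE on HDFS, FileNotFoundError here)."""
    final = table_dir / "tableinfo"
    for _ in range(retries):
        try:
            return final.read_bytes()
        except FileNotFoundError:
            time.sleep(delay)   # writer is mid-update; spin and retry
    raise IOError("tableinfo never became readable")

def recover_tableinfo(table_dir: Path) -> None:
    """Roll-forward recovery a failed-over master could run: if the writer
    crashed after the delete but before the rename, finish the rename."""
    tmp = table_dir / "tableinfo.tmp"
    final = table_dir / "tableinfo"
    if tmp.exists() and not final.exists():
        os.rename(tmp, final)   # roll forward: complete the interrupted update
    elif tmp.exists():
        tmp.unlink()            # writer died before the delete; discard the tmp
```

Note that on a POSIX local filesystem rename() already replaces an existing target atomically, so the explicit delete is redundant there; the delete-then-rename dance exists because rename in 0.20 hdfs fails when the target file already exists, which is exactly what this issue is about.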