[ https://issues.apache.org/jira/browse/HDFS-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12865522#action_12865522 ]
Benjamin Reed commented on HDFS-1137:
-------------------------------------

I think the durability guarantees of local file systems are a bit different from those of distributed file systems because of how failures happen: if a machine fails, all processes on that machine also fail. In a distributed file system, a process can continue running even though the machine hosting the DFS crashes and recovers.

You can support nice durability guarantees and even improve performance if you do the WAL correctly: log the request and do batch syncs before even starting to process the request. You can do this without touching the big name node lock. Then, after the requests are synced to disk, you process them under the big lock in the order that you synced them. This second part is a purely in-memory operation.

I think in the end the performance would be much better, the design cleaner, and the durability guarantee much easier to understand.

> Name node is using the write-ahead log improperly
> -------------------------------------------------
>
>                 Key: HDFS-1137
>                 URL: https://issues.apache.org/jira/browse/HDFS-1137
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>            Reporter: Benjamin Reed
>
> The Name node is doing the write-ahead log (WAL) (aka edit log) improperly.
> Usually when using WAL, changes are written to the log before they are
> applied to the state. Currently the Namenode does the WAL after applying the
> change. This means that reads may see changes before they are durable. A
> client may read information and the server may fail before the information is
> written to the WAL, which results in the client reading state that
> disappears. To fix this, the Namenode should write changes before (aka ahead of)
> applying the change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
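
The ordering proposed in the comment above (log the request, sync in batches, then apply under the big lock in sync order) could look roughly like the sketch below. This is a toy illustration in Python, not the Namenode's code: the class `WalNameNode`, its methods, the tab-separated log format, and the `demo` helper are all hypothetical names made up for this example, and it assumes a single thread calls `sync_and_apply` so that batches are applied in the order they were synced.

```python
import os
import tempfile
import threading


class WalNameNode:
    """Toy state machine (hypothetical): log first, sync in batches, then apply."""

    def __init__(self, log_path):
        self.log = open(log_path, "ab")
        self.pending = []                     # requests queued but not yet synced
        self.pending_lock = threading.Lock()  # guards only the pending queue
        self.state_lock = threading.Lock()    # stands in for the "big" lock
        self.state = {}                       # in-memory namespace: path -> value
        self.sync_count = 0                   # number of fsync calls issued

    def submit(self, path, value):
        """Queue a request without touching the big lock; readers cannot see it yet."""
        with self.pending_lock:
            self.pending.append((path, value))

    def sync_and_apply(self):
        """Write and fsync one batch, then apply it in the order it was logged.

        Assumed to be called from a single thread.
        """
        with self.pending_lock:
            batch, self.pending = self.pending, []
        if not batch:
            return 0
        for path, value in batch:
            self.log.write(f"{path}\t{value}\n".encode("utf-8"))
        self.log.flush()
        os.fsync(self.log.fileno())           # one sync covers the whole batch
        self.sync_count += 1
        with self.state_lock:                 # purely in-memory from here on
            for path, value in batch:
                self.state[path] = value
        return len(batch)

    def read(self, path):
        """Reads only ever see state whose log records are already on disk."""
        with self.state_lock:
            return self.state.get(path)

    def close(self):
        self.log.close()


def demo():
    """Submit two requests, then show they become visible only after the sync."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "edits.log")
        nn = WalNameNode(log_path)
        nn.submit("/a", "1")
        nn.submit("/b", "2")
        before = nn.read("/a")                # not applied yet
        applied = nn.sync_and_apply()
        after = nn.read("/a")
        nn.close()
        with open(log_path, "rb") as f:
            logged = f.read().decode("utf-8").splitlines()
        return before, applied, after, nn.sync_count, logged
```

In this sketch a read can never return a change that is not yet in the log, which is the property the issue says is currently missing, and the disk sync happens outside the lock that readers contend on.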