[ https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089883#comment-13089883 ]
Todd Lipcon commented on HDFS-1108:
-----------------------------------

Hi Konstantin. Several others and I are initially working on approach #2, in which the edit logs are accessible by both the active and standby NameNodes. It will initially be based on a NAS device, but could potentially also support other shared storage such as BookKeeper.

My understanding of the HA framework is that we are aiming to build it in such a way that neither approach is _prevented_, though obviously not all parts of one approach are necessary for the other. In this case, this patch is useful for approach #2 but does not prevent approach #1. Given the APIs available in today's BackupNode, this patch is also necessary for approach #1.

What you are describing is a potential optimization available in approach #1 that is not available in approach #2, and like you said, it's a nice advantage. But without this patch, _neither_ approach can be correct.

To put it another way, there are three options:

1) Leave the code as it is now. That causes potential data loss in both approaches.
2) Commit this patch. This fixes the potential data loss in both approaches but has a small performance impact.
3) Commit a better version of this patch which also fixes the data loss but additionally provides a more optimized code path for the BackupNode.

Given that the code does not exist for option 3, and option 1 causes data loss regardless of your opinions on HA, we should go with option 2.

Keep in mind that, even outside the scope of HA, this fixes a potential bug with NameNode restart. If you have a small cluster that can restart quickly, this patch might allow the NN to restart within the retry window of a client, allowing clients to more easily ride over such restarts. Without the patch, any files in progress will be corrupted.
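To make the restart scenario above concrete, here is a toy Java sketch. It is not HDFS code: ToyNameNode, allocateBlock, close, restart and blocksOf are all made-up names, and the "edit log" is just an in-memory list that is assumed to survive a restart. It only models the NameNode's block metadata, comparing logging at hflush()/close() time with logging at allocation time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model, NOT HDFS code. Every name here is hypothetical. */
public class ToyNameNode {
    /** One edit-log record: "block blockId belongs to file path". */
    static final class Edit {
        final String path;
        final long blockId;
        Edit(String path, long blockId) { this.path = path; this.blockId = blockId; }
    }

    private final boolean logOnAllocate;   // true = log at allocation time
    private final List<Edit> editLog;      // assumed durable: survives restart()
    private final Map<String, List<Long>> blocks = new HashMap<>(); // in memory only
    private final Map<String, Integer> logged = new HashMap<>();    // blocks logged per file
    private long nextBlockId = 1;

    public ToyNameNode(boolean logOnAllocate) {
        this(logOnAllocate, new ArrayList<>());
    }

    /** Rebuilds the in-memory state by replaying the edit log. */
    private ToyNameNode(boolean logOnAllocate, List<Edit> editLog) {
        this.logOnAllocate = logOnAllocate;
        this.editLog = editLog;
        for (Edit e : editLog) {
            blocks.computeIfAbsent(e.path, k -> new ArrayList<>()).add(e.blockId);
            logged.merge(e.path, 1, Integer::sum);
            nextBlockId = Math.max(nextBlockId, e.blockId + 1);
        }
    }

    /** Allocates a block for the file; logs it right away only if logOnAllocate. */
    public long allocateBlock(String path) {
        long id = nextBlockId++;
        blocks.computeIfAbsent(path, k -> new ArrayList<>()).add(id);
        if (logOnAllocate) {
            persist(path);
        }
        return id;
    }

    /** Stands in for hflush()/close(): logs any blocks not yet in the edit log. */
    public void close(String path) {
        persist(path);
    }

    private void persist(String path) {
        List<Long> ids = blocks.getOrDefault(path, new ArrayList<>());
        int done = logged.getOrDefault(path, 0);
        for (int i = done; i < ids.size(); i++) {
            editLog.add(new Edit(path, ids.get(i)));
        }
        logged.put(path, ids.size());
    }

    /** Simulates a restart: in-memory state is dropped, the edit log is replayed. */
    public ToyNameNode restart() {
        return new ToyNameNode(logOnAllocate, editLog);
    }

    public List<Long> blocksOf(String path) {
        return new ArrayList<>(blocks.getOrDefault(path, new ArrayList<>()));
    }

    public static void main(String[] args) {
        for (boolean patched : new boolean[] {false, true}) {
            ToyNameNode nn = new ToyNameNode(patched);
            nn.allocateBlock("/f");
            nn.allocateBlock("/f");
            // Restart before the writer has called close().
            System.out.println("logOnAllocate=" + patched + " -> " + nn.restart().blocksOf("/f"));
        }
    }
}
```

In this toy, a restart before close() leaves the restarted instance with no blocks for /f when logOnAllocate is false, and with blocks [1, 2] when it is true. The extra edit-log write on every allocation is a stand-in for the small performance cost mentioned under option 2.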

> Log newly allocated blocks
> --------------------------
>
>                 Key: HDFS-1108
>                 URL: https://issues.apache.org/jira/browse/HDFS-1108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: HA branch (HDFS-1623)
>
>         Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt, hdfs-1108.txt
>
>
> The current HDFS design says that newly allocated blocks for a file are not
> persisted in the NN transaction log when the block is allocated. Instead, a
> hflush() or a close() on the file persists the blocks into the transaction
> log. It would be nice if we can immediately persist newly allocated blocks
> (as soon as they are allocated) for specific files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira