[ 
https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089883#comment-13089883
 ] 

Todd Lipcon commented on HDFS-1108:
-----------------------------------

Hi Konstantin. Several others and I are initially working on approach #2, the 
one in which the edit logs are accessible by both the active and standby 
NameNodes. It will initially be based on a NAS device, but could potentially 
also support other shared storage such as BookKeeper.

My understanding of the HA framework is that we are aiming to build it in such 
a way that neither approach is _prevented_ - but obviously not all parts of one 
approach are necessary for the other approach. In this case, this patch is 
useful for approach #2 but does not prevent approach #1. Given the APIs 
available in today's BackupNode, this patch is also necessary for approach #1.

What you are describing is a potential optimization available in approach #1 
that is not available in approach #2 - and like you said, it's a nice 
advantage. But without this patch, _neither_ approach can be correct.

To put it another way, there are three options:
1) Leave the code as it is now. That leaves a potential data loss in both 
approaches.
2) Commit this patch. This fixes the potential data loss in both approaches 
but has a small performance impact.
3) Commit a better version of this patch which also fixes the data loss and 
additionally provides a more optimized code path for the BackupNode.

Given that the code does not exist for option 3 above, and option 1 causes 
data loss regardless of your opinions on HA, we should go with option 2 above.

Keep in mind that, even outside the scope of HA, this fixes a potential bug 
with NameNode restart. If you have a small cluster that can restart quickly, 
this patch might allow the NN to restart within the retry window of a client, 
allowing clients to more easily ride over such restarts. Without the patch, any 
files in progress will be corrupted.
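
To make the restart case concrete, here is a toy model in Python. It is only a 
sketch and not HDFS code: ToyNameNode, its methods, and the edit log record 
names are all made up for illustration, and hflush() is left out for brevity. 
It assumes a restarted NameNode rebuilds its state purely by replaying the 
edit log.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for a NameNode: in-memory state plus an edit log."""
    log_allocations: bool                          # True models the patched behavior
    edit_log: list = field(default_factory=list)   # durable, survives a restart
    blocks: dict = field(default_factory=dict)     # in-memory, lost on restart

    def create(self, path):
        self.blocks[path] = []
        self.edit_log.append(("create", path))

    def allocate_block(self, path, block_id):
        self.blocks[path].append(block_id)
        if self.log_allocations:
            # Patched behavior: persist the block as soon as it is allocated.
            self.edit_log.append(("add_block", path, block_id))

    def close(self, path):
        # In both variants, close() persists the file's full block list.
        self.edit_log.append(("close", path, tuple(self.blocks[path])))

    def restart(self):
        """Return a fresh instance whose state comes only from the edit log."""
        nn = ToyNameNode(self.log_allocations, edit_log=list(self.edit_log))
        for op in self.edit_log:
            if op[0] == "create":
                nn.blocks[op[1]] = []
            elif op[0] == "add_block":
                nn.blocks[op[1]].append(op[2])
            elif op[0] == "close":
                nn.blocks[op[1]] = list(op[2])
        return nn


def blocks_after_restart(log_allocations):
    """Allocate two blocks for a file still being written, then restart."""
    nn = ToyNameNode(log_allocations)
    nn.create("/f")
    nn.allocate_block("/f", 1)
    nn.allocate_block("/f", 2)
    return nn.restart().blocks["/f"]
```

In this model, blocks_after_restart(False) returns an empty list because the 
allocations never reached the log, while blocks_after_restart(True) returns 
both blocks. The extra log record per allocation is the small performance 
cost mentioned under option 2.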

> Log newly allocated blocks
> --------------------------
>
>                 Key: HDFS-1108
>                 URL: https://issues.apache.org/jira/browse/HDFS-1108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: HA branch (HDFS-1623)
>
>         Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt, hdfs-1108.txt
>
>
> The current HDFS design says that newly allocated blocks for a file are not 
> persisted in the NN transaction log when the block is allocated. Instead, a 
> hflush() or a close() on the file persists the blocks into the transaction 
> log. It would be nice if we can immediately persist newly allocated blocks 
> (as soon as they are allocated) for specific files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
