[ https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090499#comment-13090499 ]

Suresh Srinivas commented on HDFS-1108:
---------------------------------------

bq. Suresh, based on your comments, you should vote -1 on this patch, because 
this patch calls persistBlocks only when supportAppends or hasHA. So, why not 
enable it every time, unless a benchmark shows a serious regression?

I do not -1 while the discussion is still in progress :-) I agree we should 
understand the cost of persisting block allocations irrespective of HA. I also 
think that eventually (once 0.23 is stable enough) supportAppends could be set 
to true by default, in which case the optimization would not be effective in 
the default configuration.
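
To make the trade-off concrete, here is a rough standalone sketch of the gating 
being debated (the class and method names are made up for illustration; this is 
not the actual patch):

{code:java}
// Standalone illustration of the gating under discussion; the names here are
// invented and do not correspond to the real FSNamesystem code.
public class BlockAllocationLogging {

    private final boolean supportAppends; // are appends enabled?
    private final boolean hasHA;          // is an HA standby configured?

    public BlockAllocationLogging(boolean supportAppends, boolean hasHA) {
        this.supportAppends = supportAppends;
        this.hasHA = hasHA;
    }

    /** Called when the NN hands out a new block for a file. */
    public void onBlockAllocated(String path, long blockId) {
        // (In the real NN the in-memory block list is always updated; this toy
        // only models the journaling decision.)
        if (supportAppends || hasHA) {
            journal("persistBlocks " + path + " blk_" + blockId);
        }
        // Otherwise the allocation reaches the edit log only when the client
        // later calls hflush()/close() on the file.
    }

    private void journal(String record) {
        System.out.println("editlog <- " + record);
    }

    public static void main(String[] args) {
        new BlockAllocationLogging(false, true).onBlockAllocated("/user/foo/f", 1001L);
    }
}
{code}

If journaling every allocation turns out to be cheap, the condition collapses to 
an unconditional journal() call, which is exactly what the benchmark should tell 
us.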

bq. Suresh, yes you do repeat it. But you never answered MY question: which HA 
approach are you implementing? As you can see, you have to make choices even 
with issues that seemed to be a common part of all approaches.

I thought it was clear. I would like to implement HA with the shared-storage 
approach, not one that depends on IP failover.

bq. I like Milind's idea about an implementation "without shared storage 
assumption".

If that is the case, then let's also remove the BackupNode from the picture, to 
remove the argument that the editlog is persisted in the BackupNode's memory.

Setting HA aside, isn't the lack of block allocation persistence an issue even 
with the new append, in the following scenario:
# A block is allocated on the NN.
# The client starts writing to the block and performs an hflush.
# The NN restarts at this point and has no knowledge of the new block. During 
lease recovery, it closes the file (with no UnderConstruction block).

Will the above scenario result in loss of data?
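
To spell that timeline out, here is a small self-contained simulation (purely 
illustrative, not HDFS code): if the allocation is not journaled before the 
restart, the recovered namespace has no record of the block, so lease recovery 
closes the file with an empty block list even though the client has already 
flushed data to the DataNodes.

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the restart scenario above; it is not real HDFS code.
public class RestartScenario {

    /** Records journaled to the edit log; only this survives an NN restart. */
    private final List<String> editLog = new ArrayList<>();
    /** In-memory file -> block map; lost on restart. */
    private Map<String, List<Long>> namespace = new HashMap<>();

    void allocateBlock(String path, long blockId, boolean persistAllocation) {
        namespace.computeIfAbsent(path, p -> new ArrayList<>()).add(blockId);
        if (persistAllocation) {
            editLog.add("ADD_BLOCK " + path + " " + blockId); // proposed behaviour
        }
        // Without persistAllocation, the allocation is not in the edit log yet.
    }

    void restartAndRecoverLease(String path) {
        // Restart: the namespace is rebuilt only from the edit log.
        namespace = new HashMap<>();
        for (String op : editLog) {
            String[] parts = op.split(" ");
            namespace.computeIfAbsent(parts[1], p -> new ArrayList<>())
                     .add(Long.parseLong(parts[2]));
        }
        // Lease recovery closes the file with whatever blocks the NN knows about.
        List<Long> blocks = namespace.getOrDefault(path, new ArrayList<>());
        System.out.println("closed " + path + " with blocks " + blocks
                + (blocks.isEmpty() ? "  <- the flushed data is unreachable" : ""));
    }

    public static void main(String[] args) {
        RestartScenario withoutLogging = new RestartScenario();
        withoutLogging.allocateBlock("/f", 1001L, false); // step 1, not persisted
        // step 2: the client writes and hflushes to the DataNodes (not modelled)
        withoutLogging.restartAndRecoverLease("/f");      // step 3: file closed empty

        RestartScenario withLogging = new RestartScenario();
        withLogging.allocateBlock("/f", 1001L, true);     // allocation journaled
        withLogging.restartAndRecoverLease("/f");         // the block survives
    }
}
{code}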



> Log newly allocated blocks
> --------------------------
>
>                 Key: HDFS-1108
>                 URL: https://issues.apache.org/jira/browse/HDFS-1108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: HA branch (HDFS-1623)
>
>         Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt, hdfs-1108.txt
>
>
> The current HDFS design says that newly allocated blocks for a file are not 
> persisted in the NN transaction log at allocation time. Instead, an hflush() 
> or a close() on the file persists the blocks into the transaction log. It 
> would be nice if we could immediately persist newly allocated blocks (as soon 
> as they are allocated) for specific files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira