[ 
https://issues.apache.org/jira/browse/HDFS-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882367#action_12882367
 ] 

sam rash commented on HDFS-1186:
--------------------------------

yea, i think so.  let me restate it slightly differently to make sure I get 
this at a higher level:

1. we make sure that a lease recovery that starts with an old gs at one stage 
(that's synchronized) actually mutates the block data of only the same gs
2. new writers that come in between the start of recovery and the actual 
stamping must have a new gs, since they can only come into being via lease 
recovery

this is effectively saying that if concurrent lease recoveries get started, the 
first to complete wins (as it should), and later completions just fail.

sounds like optimistic locking/versioned puts in the cache world actually: 
updateBlock requires the source to match an expected source.
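
to make the analogy concrete, here is a minimal sketch of a versioned put on a 
generation stamp. it is not the actual DataNode code: the class VersionedBlock, 
its fields, and this updateBlock signature are hypothetical and only 
illustrate the "source must match the expected source" check.

```java
// Hypothetical sketch of an optimistic, versioned update keyed on the
// generation stamp (gs). Not the real HDFS FSDataset implementation.
final class VersionedBlock {
    private long generationStamp;
    private long length;

    VersionedBlock(long generationStamp, long length) {
        this.generationStamp = generationStamp;
        this.length = length;
    }

    // Applies the update only if the block still carries the gs the caller
    // saw when its recovery started. Returns false if another recovery
    // already stamped the block, in which case nothing is mutated.
    synchronized boolean updateBlock(long expectedGs, long newGs, long newLength) {
        if (newGs <= expectedGs) {
            throw new IllegalArgumentException("new gs must be greater than expected gs");
        }
        if (generationStamp != expectedGs) {
            return false; // stale recovery: someone else won
        }
        generationStamp = newGs;
        length = newLength;
        return true;
    }

    synchronized long getGenerationStamp() {
        return generationStamp;
    }

    synchronized long getLength() {
        return length;
    }
}
```

with two recoveries that both started at gs 1, the first call to 
updateBlock(1, 2, ...) succeeds and the second call to updateBlock(1, 3, ...) 
returns false and leaves the block untouched, which is the "first to complete 
wins" behavior described above.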

nice idea




> 0.20: DNs should interrupt writers at start of recovery
> -------------------------------------------------------
>
>                 Key: HDFS-1186
>                 URL: https://issues.apache.org/jira/browse/HDFS-1186
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20-append
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-1186.txt
>
>
> When block recovery starts (eg due to NN recovering lease) it needs to 
> interrupt any writers currently writing to those blocks. Otherwise, an old 
> writer (who hasn't realized he lost his lease) can continue to write+sync to 
> the blocks, and thus recovery ends up truncating data that has been sync()ed.

