[ https://issues.apache.org/jira/browse/HDFS-6931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113329#comment-14113329 ]
Arpit Agarwal edited comment on HDFS-6931 at 8/28/14 4:46 AM:
--------------------------------------------------------------

On restart each volume will be scanned and replicas under {{lazyPersist/}} will be moved to their corresponding locations under {{finalized/}}.

We may end up with two replicas of the same block on different volumes, so we use the following scheme to decide which replica to keep.
# Prefer the replica with the higher generation stamp.
# If generation stamps are equal, prefer the replica with the larger on-disk length.
# If on-disk length is the same, prefer the replica on persistent storage volume.
# All other factors being equal, keep replica1.

The other replica is removed from the volumeMap and is deleted from its storage volume. See {{BlockPoolSlice.resolveDuplicateReplicas}}.

Thus:
# If a replica is found on both RAM disk and in lazyPersist/, delete the copy on RAM disk and move the lazyPersist/ copy to finalized/. This will be common when a DN is restarted.
# If a replica is found on RAM disk but not in lazyPersist/, keep the copy on RAM disk and schedule a copy to disk via LazyWriter. This can occur if the DN process restarted before the replica could be saved to disk but RAM disk contents are not lost.
# If a replica is found in lazyPersist/ but not on RAM disk, save the lazyPersist/ copy to finalized. This can occur on node restart when RAM disk contents are lost.

Also added test cases.


was (Author: arpitagarwal):
On restart each volume will be scanned and replicas under {{lazyPersist/}} will be moved to their corresponding locations under {{finalized/}}.

We may end up with two replicas of the same block on different volumes, so we use the following scheme to decide which replica to keep.
# Prefer the replica with the higher generation stamp.
# If generation stamps are equal, prefer the replica with the larger on-disk length.
# If on-disk length is the same, prefer the replica on persistent storage volume.
# All other factors being equal, keep replica1.

The other replica is removed from the volumeMap and is deleted from its storage volume. See {{BlockPoolSlice.resolveDuplicateReplicas}}.

Also added test cases.

> Move lazily persisted replicas to finalized directory on DN startup
> -------------------------------------------------------------------
>
>                 Key: HDFS-6931
>                 URL: https://issues.apache.org/jira/browse/HDFS-6931
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>             Fix For: HDFS-6581
>
>         Attachments: HDFS-6931.01.patch
>
>
> On restart the DN should move replicas from the {{current/lazyPersist/}}
> directory to {{current/finalized}}. Duplicate replicas of the same block
> should be deleted from RAM disk.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
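
For readers following the thread, the replica selection scheme and the three restart cases described in the comment above could be sketched roughly as follows. This is an illustrative sketch only and is not the code from the attached patch or from {{BlockPoolSlice.resolveDuplicateReplicas}}: the class names {{ReplicaSketch}} and {{DuplicateReplicaSketch}}, the enum {{RestartAction}}, and all field and method names are hypothetical stand-ins chosen for the example.

```java
// Hypothetical stand-in for a DataNode replica record; NOT the real HDFS ReplicaInfo.
final class ReplicaSketch {
    final long blockId;
    final long generationStamp;
    final long onDiskLength;
    final boolean onPersistentVolume; // false means the replica lives on RAM disk

    ReplicaSketch(long blockId, long generationStamp,
                  long onDiskLength, boolean onPersistentVolume) {
        this.blockId = blockId;
        this.generationStamp = generationStamp;
        this.onDiskLength = onDiskLength;
        this.onPersistentVolume = onPersistentVolume;
    }
}

// Hypothetical names for the three restart cases listed in the comment.
enum RestartAction {
    DELETE_RAM_COPY_AND_FINALIZE_LAZY_PERSIST_COPY, // found on RAM disk and in lazyPersist/
    KEEP_RAM_COPY_AND_SCHEDULE_LAZY_WRITE,          // found on RAM disk only
    FINALIZE_LAZY_PERSIST_COPY,                     // found in lazyPersist/ only
    NOTHING_TO_DO                                   // found in neither location
}

final class DuplicateReplicaSketch {
    private DuplicateReplicaSketch() {}

    /**
     * Returns the replica to keep. The caller would remove the other
     * replica from the volume map and delete it from its storage volume.
     */
    static ReplicaSketch selectReplicaToKeep(ReplicaSketch replica1, ReplicaSketch replica2) {
        if (replica1.blockId != replica2.blockId) {
            throw new IllegalArgumentException("replicas belong to different blocks");
        }
        // 1. Prefer the replica with the higher generation stamp.
        if (replica1.generationStamp != replica2.generationStamp) {
            return replica1.generationStamp > replica2.generationStamp ? replica1 : replica2;
        }
        // 2. Generation stamps are equal: prefer the larger on-disk length.
        if (replica1.onDiskLength != replica2.onDiskLength) {
            return replica1.onDiskLength > replica2.onDiskLength ? replica1 : replica2;
        }
        // 3. Lengths are equal: prefer the replica on persistent storage.
        if (replica1.onPersistentVolume != replica2.onPersistentVolume) {
            return replica1.onPersistentVolume ? replica1 : replica2;
        }
        // 4. All other factors being equal, keep replica1.
        return replica1;
    }

    /** Maps where a replica was found on restart to the action described in the comment. */
    static RestartAction planRestartAction(boolean onRamDisk, boolean inLazyPersist) {
        if (onRamDisk && inLazyPersist) {
            return RestartAction.DELETE_RAM_COPY_AND_FINALIZE_LAZY_PERSIST_COPY;
        }
        if (onRamDisk) {
            return RestartAction.KEEP_RAM_COPY_AND_SCHEDULE_LAZY_WRITE;
        }
        if (inLazyPersist) {
            return RestartAction.FINALIZE_LAZY_PERSIST_COPY;
        }
        return RestartAction.NOTHING_TO_DO;
    }
}
```

In this sketch, two replicas with the same generation stamp and length, one on RAM disk and one on a persistent volume, resolve to the persistent copy, which matches the first restart case: the RAM disk copy is the one deleted.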