[ https://issues.apache.org/jira/browse/HDFS-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360412#comment-14360412 ]

Daryn Sharp commented on HDFS-7928:
-----------------------------------

# Expiry shouldn't be hardcoded
# I think the DN starting up should decide, based on its new conf, whether the 
file is stale, not the predecessor that wrote the file
# Maybe use the timestamp of the file rather than an expiry stored in the file
# An exception while writing the file should delete it, not just log and 
return, to avoid a corrupt partial file
# Need to close the file rather than just flush it
# The file should be written to a temporary file and then renamed to the final 
name, since the DN may crash on shutdown, again leading to a partial, corrupt 
file
# There doesn't appear to be a sanity check upon reading that the file is 
complete
# Should be something akin to a version number in the file
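
As a minimal sketch of points 4-8 (hypothetical class and method names, not the actual patch): write to a temp file with a version header and checksum, sync and close it, atomically rename into place, delete the temp file on any write failure, and validate version/length/checksum on read.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;
import java.util.zip.CRC32;

public class AtomicStateFile {
    static final int VERSION = 1;

    // Write payload to a sibling .tmp file, sync it to disk, then
    // atomically rename to the final name so a crash never leaves a
    // partial file under the real name.
    static void write(Path target, byte[] payload) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        CRC32 crc = new CRC32();
        crc.update(payload);
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + 8 + payload.length);
        buf.putInt(VERSION).putInt(payload.length)
           .putLong(crc.getValue()).put(payload);
        buf.flip();
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            while (buf.hasRemaining()) ch.write(buf);
            ch.force(true);               // flushing alone is not enough
        } catch (IOException e) {
            Files.deleteIfExists(tmp);    // don't leave a corrupt partial file
            throw e;
        }
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    // Read back and sanity-check; returns null if the version, length,
    // or checksum doesn't match (i.e. the file is stale or incomplete).
    static byte[] read(Path target) throws IOException {
        byte[] raw = Files.readAllBytes(target);
        if (raw.length < 16) return null;
        ByteBuffer buf = ByteBuffer.wrap(raw);
        if (buf.getInt() != VERSION) return null;
        int len = buf.getInt();
        long expected = buf.getLong();
        if (len != buf.remaining()) return null;  // truncated or padded
        byte[] payload = new byte[len];
        buf.get(payload);
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue() == expected ? payload : null;
    }
}
```

The atomic rename is the key property: the final filename only ever names a fully written, synced file.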

The patch would probably be much simpler if it were based on HDFS-7435.  It 
could call getBlocksReports, and then dump the PB-encoded reports.  Decoding 
the PB will fail if it's malformed or incomplete.  Dumping the BRs will also 
let us preserve extra bits in the replica type field, for instance the sticky 
block bit we intend to add, w/o incompatibilities.  A bonus is that the DN is 
also ready to send its reports.
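
Length-delimited PB framing is what gives the fail-fast-on-truncation property. As a stdlib-only analogy (not the actual protobuf or BlockReport code), a length-prefixed record makes a cut-short file error out on read instead of silently yielding partial data:

```java
import java.io.*;

public class TruncationDemo {
    // Frame a record as [int length][bytes], mirroring how
    // length-delimited PB messages are written to a stream.
    static byte[] encode(byte[] payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(payload.length);
        out.write(payload);
        return bos.toByteArray();
    }

    // Throws EOFException if the stream was cut short, analogous to
    // protobuf throwing InvalidProtocolBufferException on a truncated
    // message.
    static byte[] decode(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);   // fails fast if bytes are missing
        return payload;
    }
}
```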


> Scanning blocks from disk during rolling upgrade startup takes a lot of time 
> if disks are busy
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7928
>                 URL: https://issues.apache.org/jira/browse/HDFS-7928
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HDFS-7928.patch
>
>
> We observed this issue during a rolling upgrade to 2.6.x on one of our 
> clusters.
> One of the disks was very busy, and it took a long time to scan that disk 
> compared to the others.
> The sar (System Activity Reporter) data showed that the particular disk was 
> very busy performing IO operations.
> Requesting an improvement to datanode rolling upgrade:
> During shutdown, we can persist the whole volume map on disk and let the 
> datanode read that file to recreate the volume map during startup after a 
> rolling upgrade.
> This will not require the datanode process to scan all the disks and read 
> the blocks.
> This will significantly improve datanode startup time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
