[ https://issues.apache.org/jira/browse/HDFS-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196647#comment-13196647 ]
Konstantin Shvachko commented on HDFS-2718:
-------------------------------------------

Thanks guys for the reviews.

JP> it should be fine as the existing lease will just be renewed, please confirm.
Correct.

JP> If blocks are null diskspace will be zero. In the existing code it is -1 (UNKNOWN_DISK_SPACE).
Good point. I just realized that explicit calculation of diskspace is not necessary here at all, because it is calculated in addNode() if childDiskspace < 0. I am removing the diskspace calculation completely.

JP> COMMITTED code was removed from convertToCompleteBlock and moved to the caller.
The COMMITTED state is a confirmation from the client that it has finished writing the block. During edits loading a block will never be in the COMMITTED state, since there are no clients, so we force it to the COMPLETE state. The check that the block is COMMITTED should be performed only if block completion is not forced. Therefore I had to move the verification to the caller.

ATM> do you have any numbers on the performance gain of edits loading attained from this change?
I created 500,000 files using the CreateEditsLog utility, and then started the NameNode before and after applying the patch. The results show just over 20% improvement (18 seconds down to 14 seconds). I expect that with addBlock transactions in place the gain will be higher.
{code}
Before
12/01/27 16:05:21 INFO common.Storage: Edits file /hadoop-data/hdfs/name/current/edits of size 286856143 edits # 1005001 loaded in 18 seconds.
After
12/01/27 16:21:17 INFO common.Storage: Edits file /hadoop-data/hdfs/name/current/edits of size 203356143 edits # 1005001 loaded in 14 seconds.
{code}

As a side effect it turned out that CreateEditsLog generates non-standard transactions. In real life the first transaction that creates a file does not contain blocks, while CreateEditsLog adds blocks to this transaction. I had to introduce an additional constructor for INodeFileUnderConstruction in order to cover this use case.

Attaching files shortly.
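For readers following the COMMITTED/COMPLETE discussion, here is a minimal sketch of the idea: the conversion itself no longer checks the state, and the caller checks for COMMITTED only when completion is not forced. The class, enum, and method signatures below are simplified stand-ins invented for illustration; they are not the actual HDFS code or the contents of the patch.

```java
// Hypothetical, simplified model of the block state handling described above.
// None of these types are the real HDFS classes.
class BlockCompletionSketch {

    enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    static final class Block {
        BlockState state = BlockState.UNDER_CONSTRUCTION;
    }

    // Unconditional conversion: the COMMITTED check does not live here.
    static void convertToCompleteBlock(Block block) {
        block.state = BlockState.COMPLETE;
    }

    // Caller-side verification. With force == true (the edits loading case,
    // where no client exists to commit the block) the check is skipped.
    static void completeBlock(Block block, boolean force) {
        if (block.state == BlockState.COMPLETE) {
            return; // already complete, nothing to do
        }
        if (!force && block.state != BlockState.COMMITTED) {
            throw new IllegalStateException(
                "Cannot complete block: it has not been COMMITTED by the client");
        }
        convertToCompleteBlock(block);
    }

    public static void main(String[] args) {
        // Edits loading path: completion is forced.
        Block replayed = new Block();
        completeBlock(replayed, true);
        System.out.println("edits loading: " + replayed.state);

        // Client path: an uncommitted block is rejected.
        Block live = new Block();
        try {
            completeBlock(live, false);
        } catch (IllegalStateException e) {
            System.out.println("client path: " + e.getMessage());
        }
    }
}
```

Keeping the check in the caller means the forced and non-forced paths share one conversion routine, which is the reason given above for moving the verification.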
> Optimize OP_ADD in edits loading
> --------------------------------
>
>                 Key: HDFS-2718
>                 URL: https://issues.apache.org/jira/browse/HDFS-2718
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.22.0, 0.24.0, 1.0.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>         Attachments: editsLoader-0.22.patch, editsLoader-trunk.patch,
> editsLoader-trunk.patch
>
>
> While loading the edits journal, FSEditLog.loadEditRecords() processes OP_ADD
> inefficiently. It first removes the existing INodeFile from the directory
> tree, then adds it back as a regular INodeFile, and then replaces it with
> INodeFileUnderConstruction if the file is not closed. This slows down edits
> loading. OP_ADD should be done in one shot and retain previously existing
> data.