[ 
https://issues.apache.org/jira/browse/HDFS-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196647#comment-13196647
 ] 

Konstantin Shvachko commented on HDFS-2718:
-------------------------------------------

Thanks guys for the reviews.

JP> it should be fine as the existing lease will just be renewed, please 
confirm.

Correct.

JP> If blocks are null diskspace will be zero. In the existing code it is -1 
(UNKNOWN_DISK_SPACE).

Good point. I just realized that explicitly calculating diskspace is not
necessary here at all, because it is calculated in addNode() if childDiskspace
< 0. I am removing the diskspace calculation completely.
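
To illustrate the point, here is a minimal sketch of the "compute only if 
negative" pattern. It is not the actual HDFS code: the class, the method names, 
and the formula (sum of block sizes times replication) are assumptions made for 
illustration; only UNKNOWN_DISK_SPACE (-1) and the childDiskspace < 0 check 
come from the discussion above.

```java
// Hypothetical stand-in for the NameNode logic; not the real HDFS classes.
final class DiskspaceSketch {
    // Sentinel mentioned in the review: "diskspace not yet known".
    static final long UNKNOWN_DISK_SPACE = -1;

    // Assumed formula: total block bytes times replication; 0 when blocks are null.
    static long computeDiskspace(long[] blockSizes, short replication) {
        long total = 0;
        if (blockSizes != null) {
            for (long size : blockSizes) {
                total += size;
            }
        }
        return total * replication;
    }

    // Mimics the add path: compute only when the caller passed a negative value.
    static long resolveDiskspace(long childDiskspace, long[] blockSizes,
                                 short replication) {
        return childDiskspace < 0
            ? computeDiskspace(blockSizes, replication)
            : childDiskspace;
    }
}
```

With this shape the caller can pass UNKNOWN_DISK_SPACE and skip its own 
calculation, which is why the explicit computation becomes redundant.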

JP> COMMITTED code was removed from convertToCompleteBlock and moved to the 
caller.

The COMMITTED state is a confirmation from the client that it has finished
writing the block.
During edits loading a block will never be in the COMMITTED state, since there
are no clients, so we force it to the COMPLETE state. The check that the block
is COMMITTED should be performed only if block completion is not forced.
Therefore I had to move the verification to the caller.
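
The control flow described above can be sketched roughly as follows. This is an 
illustration under assumptions, not the patch itself: the Block class, the 
BlockState enum, the completeBlock() caller and its force flag are hypothetical 
names; only convertToCompleteBlock and the COMMITTED/COMPLETE states are taken 
from the discussion.

```java
// Hypothetical sketch; names other than convertToCompleteBlock are assumptions.
final class BlockCompletionSketch {
    enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    static final class Block {
        BlockState state = BlockState.UNDER_CONSTRUCTION;
    }

    // Unconditional conversion: the COMMITTED check no longer lives here.
    static void convertToCompleteBlock(Block block) {
        block.state = BlockState.COMPLETE;
    }

    // Caller: verifies COMMITTED only when completion is not forced.
    static void completeBlock(Block block, boolean force) {
        if (block.state == BlockState.COMPLETE) {
            return; // nothing to do
        }
        if (!force && block.state != BlockState.COMMITTED) {
            throw new IllegalStateException(
                "Cannot complete block: block has not been COMMITTED by the client");
        }
        convertToCompleteBlock(block);
    }
}
```

In this sketch edits loading would call completeBlock(block, true), while the 
regular client-driven path would pass false and keep the COMMITTED check.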

ATM> do you have any numbers on the performance gain of edits loading attained 
from this change?

I created 500,000 files using the CreateEditsLog utility, and then started the
NameNode before and after applying the patch. The results show just over 20%
improvement.
I expect that with addBlock transactions in place the gain will be higher.
{code}
Before
12/01/27 16:05:21 INFO common.Storage: Edits file 
/hadoop-data/hdfs/name/current/edits of size 286856143 edits # 1005001 loaded 
in 18 seconds.

After
12/01/27 16:21:17 INFO common.Storage: Edits file 
/hadoop-data/hdfs/name/current/edits of size 203356143 edits # 1005001 loaded 
in 14 seconds.
{code}
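
For reference, the percentage can be reproduced from the two load times in the 
log. The helper below is only an illustrative calculation (the class and method 
names are made up), and the logged times are whole seconds, so the result is 
approximate.

```java
// Illustrative arithmetic only; names are hypothetical.
final class LoadTimeGain {
    // Relative reduction in load time, in percent.
    static double percentReduction(double beforeSeconds, double afterSeconds) {
        return 100.0 * (beforeSeconds - afterSeconds) / beforeSeconds;
    }
}
```

LoadTimeGain.percentReduction(18, 14) evaluates to about 22.2, i.e. 4 seconds 
saved out of 18.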

As a side effect, it turned out that CreateEditsLog generates non-standard
transactions. In real life the first transaction that creates a file does not
contain blocks, while CreateEditsLog adds blocks to this transaction. I had to
introduce an additional constructor for INodeFileUnderConstruction in order to
cover this use case.
Attaching files shortly.
                
> Optimize OP_ADD in edits loading
> --------------------------------
>
>                 Key: HDFS-2718
>                 URL: https://issues.apache.org/jira/browse/HDFS-2718
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.22.0, 0.24.0, 1.0.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>         Attachments: editsLoader-0.22.patch, editsLoader-trunk.patch, 
> editsLoader-trunk.patch
>
>
> While loading the edits journal, FSEditLog.loadEditRecords() processes OP_ADD
> inefficiently. It first removes the existing INodeFile from the directory
> tree, then adds it back as a regular INodeFile, and then replaces it with
> INodeFileUnderConstruction if the file is not closed. This slows down edits
> loading. OP_ADD should be done in one shot and retain previously existing
> data.
