[ 
https://issues.apache.org/jira/browse/HDFS-15493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168635#comment-17168635
 ] 

Chengwei Wang edited comment on HDFS-15493 at 7/31/20, 11:06 AM:
-----------------------------------------------------------------

After carefully reviewing the code that updates the blocks map and name cache, 
I found that it is feasible to start these updates when loading of the 
INodeSection begins, and to shut down the executors once loading of the 
INodeDirectorySection completes. As a result, almost no time is spent waiting 
for the executors to terminate.

I have submitted a patch, [^HDFS-15493.004.patch], based on this approach. It 
uses two single-thread executors and performs the updates without locking.
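
To illustrate the idea only, here is a minimal sketch; it is not the code in the patch. The class ParallelUpdateSketch, its methods, and the two plain HashMaps are hypothetical stand-ins for the loader, the blocks map and the name cache. The assumption it relies on is that each structure is only ever touched by its own single worker thread, which is why no lock is needed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the fsimage loader; not taken from the patch. */
class ParallelUpdateSketch {
  // Each map is modified by exactly one worker thread, so no locking is used.
  private final Map<Long, String> blocksMap = new HashMap<>();
  private final Map<String, Integer> nameCache = new HashMap<>();

  // One single-thread executor per structure, created before loading starts.
  private final ExecutorService blocksMapUpdater =
      Executors.newSingleThreadExecutor();
  private final ExecutorService nameCacheUpdater =
      Executors.newSingleThreadExecutor();

  /** Called by the loading thread for each file inode it has read. */
  void onInodeLoaded(long blockId, String name) {
    blocksMapUpdater.execute(() -> blocksMap.put(blockId, name));
    nameCacheUpdater.execute(() -> nameCache.merge(name, 1, Integer::sum));
  }

  /**
   * Called once directory loading has completed. Stops accepting new tasks,
   * waits for the queued updates, and returns the wait in milliseconds.
   */
  long finish() {
    long start = System.nanoTime();
    blocksMapUpdater.shutdown();
    nameCacheUpdater.shutdown();
    try {
      if (!blocksMapUpdater.awaitTermination(1, TimeUnit.MINUTES)
          || !nameCacheUpdater.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("updaters did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }

  // Read these only after finish() has returned.
  int blockCount() {
    return blocksMap.size();
  }

  int nameCount(String name) {
    return nameCache.getOrDefault(name, 0);
  }
}
```

Because the updates are queued while the loading thread keeps reading inodes, most of the work should already be done by the time finish() is called, so the wait it reports is expected to be short.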

Tested this patch twice.
{code:java}
Test1.
    20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
    20/07/31 18:27:50 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1
    20/07/31 18:27:51 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 367 seconds.
Test2.
    20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed loading all INodeDirectory sub-sections
    20/07/31 18:48:03 INFO namenode.FSImageFormatPBINode: Completed update blocks map and name cache, total waiting duration: 1
    20/07/31 18:48:04 INFO namenode.FSImageFormatProtobuf: Loaded FSImage in 363 seconds.
{code}
Based on my tests, this gives a speedup of about 20%, reducing the load time 
from 460s+ to 360s+.

I think this patch may be the best option. [~sodonnell], could you help me 
test it on trunk?

 



> Update block map and name cache in parallel while loading fsimage.
> ------------------------------------------------------------------
>
>                 Key: HDFS-15493
>                 URL: https://issues.apache.org/jira/browse/HDFS-15493
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Chengwei Wang
>            Priority: Major
>         Attachments: HDFS-15493.001.patch, HDFS-15493.002.patch, 
> HDFS-15493.003.patch, HDFS-15493.004.patch, fsimage-loading.log
>
>
> While loading the INodeDirectorySection of the fsimage, the name cache and 
> block map are updated after each inode file is added to its inode directory. 
> Running these steps in parallel would reduce the time cost of fsimage loading.
> In our test case, with patches HDFS-13694 and HDFS-14617 applied, the time 
> cost to load the fsimage (220M files & 240M blocks) is 470s; with this patch, 
> the time cost is reduced to 410s.



