[ 
https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292545#comment-14292545
 ] 

Haohui Mai commented on HDFS-6673:
----------------------------------

bq. I think that the point that Andrew is trying to make is that this tool will 
run quickly on machines with more memory, while still being possible to use on 
machines with less memory.

That's great. My concern is that once the LevelDB grows beyond the 
working set, it requires one seek per inode. It will thrash the system at some 
fsimage size (which depends heavily on the machine that runs the oiv). As 
many of the oiv tools that Hadoop has today would like to print out the full 
path, I would like to get the architecture right and make sure the issue is 
addressed.
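
To make the access pattern concrete, here is a minimal sketch of resolving 
full paths by walking parent pointers. It is not the oiv code: the dict is 
only a stand-in for an on-disk store such as LevelDB, and the names 
({{CountingStore}}, {{full_path}}, the sample inodes) are hypothetical. Every 
{{get}} is a lookup that can turn into a disk seek once the store no longer 
fits in memory.

```python
# Hypothetical sample data: inode id -> (parent id, name). The root has no parent.
INODES = {
    1: (None, ""),
    2: (1, "user"),
    3: (2, "alice"),
    4: (3, "data.txt"),
    5: (1, "tmp"),
}


class CountingStore:
    """Wraps a dict and counts lookups; each lookup stands in for a possible seek."""

    def __init__(self, entries):
        self._entries = entries
        self.lookups = 0

    def get(self, inode_id):
        self.lookups += 1
        return self._entries[inode_id]


def full_path(store, inode_id):
    """Walk parent pointers up to the root and join the names into a path."""
    parts = []
    current = inode_id
    while current is not None:
        parent, name = store.get(current)
        parts.append(name)
        current = parent
    return "/".join(reversed(parts)) or "/"


store = CountingStore(INODES)
paths = [full_path(store, inode_id) for inode_id in sorted(INODES)]
```

This naive walk does one lookup per path component. Caching directory paths 
would bring it closer to one lookup per inode, but those lookups are still 
random rather than sequential.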

bq.  Eventually we will probably want to drop support entirely, perhaps in 
Hadoop 3.0. There is a maintenance burden associated with maintaining two image 
formats.

Agreed. I retired the old format in HDFS-6158 and it was revived in HDFS-6293. 
The main requirements of oiv are:

* The OIV can print out the full path for an inode
* The OIV can run on commodity machines like a laptop even for the largest 
fsimage in production
* Reads of the fsimage need to be disk-friendly, meaning that the number 
of seeks is minimized. 

There are two practical solutions that I can see so far:

* Convert the fsimage into LevelDB before running the oiv
* Tweak the saver of the pb-based fsimage so that it stores the inodes in 
full-path order. This can be done without changing the format of the 
current fsimage.

Maybe we can explore these solutions?
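
As a rough illustration of the second option, the sketch below writes inode 
records in depth-first, name-sorted order and then resolves full paths in a 
single sequential pass, keeping only the current chain of ancestors in memory. 
This is an assumption about how such an ordering could be used, not the 
actual fsimage saver; {{path_ordered}}, {{resolve_sequentially}}, and the 
sample data are hypothetical names.

```python
# Hypothetical sample data: inode id -> (parent id, name). The root has no parent.
SAMPLE = {
    1: (None, ""),
    2: (1, "user"),
    3: (2, "alice"),
    4: (3, "data.txt"),
    5: (1, "tmp"),
    6: (2, "bob"),
}


def path_ordered(inodes):
    """Return (inode_id, parent_id, name) records in depth-first, name-sorted order."""
    children = {}
    for inode_id, (parent, name) in inodes.items():
        children.setdefault(parent, []).append((name, inode_id))
    ordered = []
    # Push in reverse-sorted order so the smallest name is popped first.
    stack = sorted(children.get(None, []), reverse=True)
    while stack:
        name, inode_id = stack.pop()
        ordered.append((inode_id, inodes[inode_id][0], name))
        stack.extend(sorted(children.get(inode_id, []), reverse=True))
    return ordered


def resolve_sequentially(records):
    """Resolve full paths in one pass; assumes records are in path_ordered() order."""
    ancestors = []  # (inode_id, path) entries from the root down to the last record
    paths = []
    for inode_id, parent, name in records:
        # Drop entries that are not the parent of the current record.
        while ancestors and ancestors[-1][0] != parent:
            ancestors.pop()
        if parent is None:
            path = "/"
        else:
            path = ancestors[-1][1].rstrip("/") + "/" + name
        ancestors.append((inode_id, path))
        paths.append(path)
    return paths


records = path_ordered(SAMPLE)
ordered_paths = resolve_sequentially(records)
```

Because each parent is written before its children, the reader never has to 
look up an inode out of order, so memory use is bounded by the depth of the 
tree rather than the number of inodes.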

> Add Delimited format supports for PB OIV tool
> ---------------------------------------------
>
>                 Key: HDFS-6673
>                 URL: https://issues.apache.org/jira/browse/HDFS-6673
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 2.4.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>         Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, 
> HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, 
> HDFS-6673.005.patch, HDFS-6673.006.patch
>
>
> The new oiv tool, which is designed for Protobuf fsimage, lacks a few 
> features supported in the old {{oiv}} tool. 
> This task adds support for the _Delimited_ processor to the oiv tool. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
