Yiqun Lin created HDFS-10778:
--------------------------------

             Summary: Optimize the output result of FileDistribution processor 
in hdfs oiv command
                 Key: HDFS-10778
                 URL: https://issues.apache.org/jira/browse/HDFS-10778
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: tools
    Affects Versions: 2.7.1
            Reporter: Yiqun Lin
            Assignee: Yiqun Lin
            Priority: Minor


Currently, the output of the {{FileDistribution}} processor in the hdfs oiv 
command is not easy for users to understand. For example, this is an original 
output:
{code}
Size    NumFiles
0       22556
1048576 404971
2097152 29259
3145728 16937
4194304 9197
5242880 6889
6291456 4930
7340032 4070
8388608 299384
9437184 274623
{code}
Two aspects make this hard for users to understand.

First, the size column shows a raw number of bytes, which is not readable. 
It would be better to show it with a binary prefix.
Second, the size column would be better shown as a size range. This would let 
users know that the value in the {{NumFiles}} column was counted from size A 
to size B.

The expected output should look like this:
{code}
Size Range      NumFiles
(0 B, 0 B]      1666332
(0, 1 M]        778473
(1 M, 2 M]      35125
(2 M, 3 M]      13978
(3 M, 4 M]      10158
(4 M, 5 M]      6970
{code} 
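
The mapping from the current byte-valued rows to the proposed range labels could be sketched roughly as below. This is only an illustration, not the actual patch: the class {{SizeRangeSketch}} and the methods {{formatSize}} and {{rangeLabel}} are hypothetical names. It assumes that each row's size is the inclusive upper bound of a bucket of width {{step}}, and that bucket 0 holds only empty files. The exact label format (for example, whether the lower bound of the first range carries a unit) is also an assumption.

```java
import java.util.Locale;

// Hypothetical helper, NOT part of Hadoop: all names and the label format are assumptions.
final class SizeRangeSketch {
    // Binary prefixes: 1 K = 1024 B, 1 M = 1024 K, and so on.
    private static final String[] UNITS = {"B", "K", "M", "G", "T", "P", "E"};

    // Formats a byte count with a binary prefix, e.g. 1048576 -> "1 M".
    static String formatSize(long bytes) {
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Print whole values without decimals, otherwise keep one decimal place.
        if (value == Math.floor(value)) {
            return String.format(Locale.ROOT, "%d %s", (long) value, UNITS[unit]);
        }
        return String.format(Locale.ROOT, "%.1f %s", value, UNITS[unit]);
    }

    // Assumption: bucket i covers ((i - 1) * step, i * step]; bucket 0 is empty files only.
    static String rangeLabel(int bucket, long step) {
        if (bucket == 0) {
            return "(0 B, 0 B]"; // mirrors the first row of the expected output above
        }
        long lower = (long) (bucket - 1) * step;
        long upper = (long) bucket * step;
        return "(" + formatSize(lower) + ", " + formatSize(upper) + "]";
    }

    public static void main(String[] args) {
        long step = 1048576L; // 1 M, matching the spacing of the rows in the original output
        for (int bucket = 0; bucket <= 5; bucket++) {
            System.out.println(rangeLabel(bucket, step));
        }
    }
}
```

With a step of 1048576 bytes, the row currently printed as {{2097152}} would be labelled {{(1 M, 2 M]}} under these assumptions.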



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
