[ 
https://issues.apache.org/jira/browse/HDFS-6133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14326120#comment-14326120
 ] 

Yongjun Zhang commented on HDFS-6133:
-------------------------------------

Hi [~szetszwo],

Thanks for your comment about the possible performance optimization. It's
amazing that you even tried a test program; thanks for doing it! Yes, your
result is pretty good. I agree that we don't have to do the optimization for
now. I was just thinking about avoiding unnecessary computation where possible,
especially on a busy node.

BTW, 
{quote}
In many clusters, I heard that on average a file only has 1.x blocks.
{quote}
I'm surprised to hear this statistic, but it's good to know. I always had the
impression that we deal with a relatively small number of big files, rather than
many small files.

Thanks.



> Make Balancer support exclude specified path
> --------------------------------------------
>
>                 Key: HDFS-6133
>                 URL: https://issues.apache.org/jira/browse/HDFS-6133
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover, datanode
>            Reporter: zhaoyunjiong
>            Assignee: zhaoyunjiong
>             Fix For: 2.7.0
>
>         Attachments: HDFS-6133-1.patch, HDFS-6133-10.patch, 
> HDFS-6133-11.patch, HDFS-6133-2.patch, HDFS-6133-3.patch, HDFS-6133-4.patch, 
> HDFS-6133-5.patch, HDFS-6133-6.patch, HDFS-6133-7.patch, HDFS-6133-8.patch, 
> HDFS-6133-9.patch, HDFS-6133.patch
>
>
> Currently, running the Balancer destroys the RegionServer's data locality.
> If getBlocks could exclude blocks belonging to files that have a specific path
> prefix, like "/hbase", then we could run the Balancer without destroying
> the RegionServer's data locality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
