[ 
https://issues.apache.org/jira/browse/HADOOP-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653956#action_12653956
 ] 

Konstantin Shvachko commented on HADOOP-4764:
---------------------------------------------

Ruyue Ma,
# Currently recursive setReplication() for a directory is handled on the client 
side. The name-node can change replication only for individual files. So I was 
proposing to optimize that by handling recursive replications on the name-node 
itself.
# I do not understand your proposal to change only the edits file format. Does 
that mean fsimage will not store replication for directories? If so, then after 
the second name-node restart you lose the replication settings of all current 
directories, right?
# I did not get exactly what the semantics of directory replication are: does it 
apply to direct children only or to the whole subtree?
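
To make the first and third points concrete, here is a minimal sketch on a toy 
in-memory namespace. It is not the HDFS API: ToyNamespace, Node, and every 
method name are hypothetical, and perFileCalls only stands in for the 
per-file requests a client issues when the recursion runs on the client side. 
The two methods correspond to the two possible readings of directory 
replication asked about above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory namespace for illustration only; not the HDFS API. */
class ToyNamespace {
    static class Node {
        final String name;
        final boolean isDir;
        short replication;                       // used for files only in this model
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean isDir, short replication) {
            this.name = name;
            this.isDir = isDir;
            this.replication = replication;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Counts per-file calls; a stand-in for client-to-name-node requests. */
    static int perFileCalls = 0;

    /** The only primitive assumed here: change replication of a single file. */
    static void setFileReplication(Node file, short r) {
        perFileCalls++;
        file.replication = r;
    }

    /** Reading 1: the setting applies to every file in the whole subtree. */
    static void setReplicationSubtree(Node n, short r) {
        if (!n.isDir) {
            setFileReplication(n, r);
            return;
        }
        for (Node c : n.children) {
            setReplicationSubtree(c, r);
        }
    }

    /** Reading 2: the setting applies to the directory's direct child files only. */
    static void setReplicationDirectChildren(Node dir, short r) {
        for (Node c : dir.children) {
            if (!c.isDir) {
                setFileReplication(c, r);
            }
        }
    }
}
```

In this model every file touched costs one call, which is the overhead that 
handling the recursion on the name-node would avoid.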

Zheng Shao,
Which part of the source code will be responsible for reading the dummy file 
and retrieving the default replication? If it is done on the application level, 
I am fine with it. If the name-node will have to create an extra file for every 
directory then I am worried a lot.

Adding fields to file/directory inodes or creating additional "system" files 
increases the memory footprint of the namespace (a larger heap will be required 
to store the same number of files). With directories it is not so critical, 
because there are not so many of them, but it still should be justified.

Based on my estimation of required code changes I should mention that what you 
propose (introducing replications for directories) is a new feature, and should 
go through all the steps required for new features, which include
- motivation;
- design document;
- implementation (patch);
- test planning, testing, and support.

So if you feel like you have enough bandwidth, enthusiasm, etc. to do that, let's 
start with the design proposal.
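
As a possible starting point for that design discussion, the create-time rule 
from the issue description (a requested replication of 0 means "take the parent 
directory's value") could be written down as below. This is only a sketch: 
ReplicationResolver, INHERIT, and resolve are hypothetical names, and the 
fallback to a cluster default when the parent directory has no value set is an 
assumption that the description does not cover.

```java
/** Hypothetical sketch of the proposed create-time rule; not existing HDFS code. */
class ReplicationResolver {
    /** Sentinel from the issue description: 0 means "inherit from the parent". */
    static final short INHERIT = 0;

    /**
     * Picks the replication for a newly created file.
     *
     * @param requested            value passed by the client on create
     * @param parentDirReplication value stored on the parent directory, 0 if unset
     * @param clusterDefault       fallback when nothing is set (assumption)
     */
    static short resolve(short requested, short parentDirReplication, short clusterDefault) {
        if (requested != INHERIT) {
            return requested;              // an explicit client value wins
        }
        if (parentDirReplication > 0) {
            return parentDirReplication;   // inherit from the parent directory
        }
        return clusterDefault;             // assumed fallback, not in the proposal
    }
}
```

A design document would still have to say whether the parent's value is looked 
up once at create time, as here, or re-evaluated when the directory setting 
changes later.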

> add replication factor for hdfs directory
> -----------------------------------------
>
>                 Key: HADOOP-4764
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4764
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Ruyue Ma
>            Assignee: Ruyue Ma
>
> If we can set a replication factor for a directory, we can modify the 
> DFSClient.create() method to pass 0 for the default block replication. When 
> the Namenode checks the create request and the block replication is 0, it 
> assigns the parent dir's replication factor to the file. This will simplify 
> the administration work. For example, we can set the replication factor of 
> the /Test or /Tmp dir to 2 or 1, and then all their child files and dirs get 
> replication factor 2 or 1 by default.
>    

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
