[ 
https://issues.apache.org/jira/browse/HDFS-6075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932300#comment-13932300
 ] 

Ravi Prakash commented on HDFS-6075:
------------------------------------

I think another option would be to disable replication temporarily on a set 
of *nodes* rather than cluster-wide, i.e. supply the list of nodes and a 
timeout after which replication for those nodes resumes.
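
To make the idea concrete, here is a minimal sketch of the decision it implies. 
This is not HDFS code: ReplicationHold and should_replicate are hypothetical 
names made up for illustration, and the real NameNode logic would look 
different.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ReplicationHold:
    """Hypothetical record: nodes whose missing replicas are not re-replicated yet."""
    nodes: FrozenSet[str]   # hostnames of the DataNodes under maintenance
    expires_at: float       # timeout, in seconds since the epoch

    def is_active(self, now: float) -> bool:
        # The hold applies only until the timeout is reached.
        return now < self.expires_at


def should_replicate(missing_from: str,
                     hold: Optional[ReplicationHold],
                     now: float) -> bool:
    """Return True if a replica missing from `missing_from` may be re-replicated now."""
    if hold is None or not hold.is_active(now):
        return True                       # no hold, or the timeout has passed
    return missing_from not in hold.nodes  # nodes outside the list are unaffected
```

With a hold on {"dn1", "dn2"} expiring at t=1000, a replica missing from dn1 
is skipped at t=500 but scheduled again at t=1000, while a replica missing 
from any other node is always scheduled.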

> Introducing "non-replication mode"
> ----------------------------------
>
>                 Key: HDFS-6075
>                 URL: https://issues.apache.org/jira/browse/HDFS-6075
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Adam Kawa
>            Priority: Minor
>
> AFAIK, HDFS does not provide an easy way to temporarily disable the 
> replication of missing blocks.
> To disable replication temporarily today, you would have to either
> * set dfs.namenode.replication.interval (_The periodicity in seconds with 
> which the namenode computes replication work for datanodes_, default 3) to 
> something very high. *Disadvantage*: you have to restart the NN
> * enter safe mode. *Disadvantage*: all write operations will fail
> We need to replace the top-of-rack switch in each of our racks. Replacing a 
> switch should take around 30 minutes, and each rack holds around 0.6 PB of 
> data. We would like to avoid an expensive re-replication, since we know that 
> the rack will be back online quickly. Temporarily disabling replication 
> would let us avoid both downtime and excessive network transfer.
> The default block placement policy spreads the replicas of each block across 
> two racks, so when one rack temporarily goes offline, we still have access to 
> at least one replica of each block. Of course, if we lose that replica, we 
> would have to wait until the rack comes back online. The administrator 
> should be aware of this risk.
> This feature could disable replication
> * globally - for the whole cluster
> * partially - e.g. only for missing blocks that come from a specified set of 
> DataNodes. A file like "we_will_be_back_soon" :) could be introduced for 
> this, similar to the include and exclude files.
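
For a sense of scale, the figures in the description (0.6 PB per rack, about 
30 minutes per switch) can be turned into a rough bandwidth estimate. This is 
only back-of-the-envelope arithmetic under stated assumptions: decimal units 
(1 PB = 10^15 bytes), and the worst case in which every byte on the rack would 
be copied within the outage window. The function name is made up for 
illustration.

```python
BYTES_PER_PB = 10**15  # assumption: decimal petabytes


def required_gbit_per_s(data_pb: float, minutes: float) -> float:
    """Sustained throughput (Gbit/s) needed to copy `data_pb` in `minutes`."""
    total_bits = data_pb * BYTES_PER_PB * 8
    return total_bits / (minutes * 60) / 1e9


# Figures from the issue description: 0.6 PB per rack, 30 minutes per switch.
print(f"{required_gbit_per_s(0.6, 30):.0f} Gbit/s")
```

That works out to roughly 2.7 Tbit/s sustained, so in practice the cluster 
would only copy part of the data before the rack returns, and that partial 
copy is the wasted transfer the description wants to avoid.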



--
This message was sent by Atlassian JIRA
(v6.2#6252)
