[jira] [Commented] (HDFS-7787) Wrong priority of replication

2015-02-13 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321144#comment-14321144
 ] 

Ravi Prakash commented on HDFS-7787:


Thanks Frode!
Could you please verify that the block was indeed replicated *from* the 
decommissioning node; i.e. did you see such a message in the logs? Or are you 
inferring that only from the "number of underreplicated blocks" and "blocks 
with no live replicas"? E.g. I can think of another explanation.
Let's say there are nodeX, nodeY and nodeZ, and you have decommissioned nodeZ. 
Let's say blockA is on nodeX and nodeZ. This would count under "number of 
underreplicated blocks". Since the replication work is calculated per 
datanode, maybe blockA was replicated *from* nodeX to (say) nodeY. Thus the 
count for "number of underreplicated blocks" would go down before "blocks with 
no live replicas". In the meantime blockB, blockC etc., which were present 
only on nodeZ, were being replicated (but since they are only on this one 
node, their counts will decrease more slowly). This obviously assumes that 
nodeX doesn't have any blocks which fall under "blocks with no live replicas", 
which could have been the case.
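The scenario above can be sketched in plain Java (this is illustrative code, not Hadoop internals; the block/node names follow the example):

```java
import java.util.*;

// A tiny sketch of the scenario above: blockA sits on nodeX and the
// decommissioning nodeZ; blockB and blockC sit only on nodeZ. With a
// replication target of 3, all three blocks are "underreplicated", but
// only blockB and blockC have "no live replicas".
class DecommissionCounts {
    // Blocks below their replication target.
    static long underReplicated(Map<String, Set<String>> replicas, int target) {
        return replicas.values().stream().filter(r -> r.size() < target).count();
    }

    // Blocks whose every copy is on a decommissioning node.
    static long noLiveReplicas(Map<String, Set<String>> replicas,
                               Set<String> decommissioning) {
        return replicas.values().stream()
                .filter(decommissioning::containsAll).count();
    }

    public static void main(String[] args) {
        Set<String> decommissioning = Set.of("nodeZ");
        Map<String, Set<String>> replicas = new HashMap<>();
        replicas.put("blockA", new HashSet<>(List.of("nodeX", "nodeZ")));
        replicas.put("blockB", new HashSet<>(List.of("nodeZ")));
        replicas.put("blockC", new HashSet<>(List.of("nodeZ")));

        System.out.println(underReplicated(replicas, 3));             // 3
        System.out.println(noLiveReplicas(replicas, decommissioning)); // 2

        // nodeX (not the decommissioning node) re-replicates blockA to nodeY:
        replicas.get("blockA").add("nodeY");
        System.out.println(underReplicated(replicas, 3));             // 2: went down
        System.out.println(noLiveReplicas(replicas, decommissioning)); // 2: unchanged
    }
}
```

This shows how "number of underreplicated blocks" can drop while "blocks with no live replicas" stays flat, without any block ever leaving the decommissioning node.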

> Wrong priority of replication
> 
>
> Key: HDFS-7787
> URL: https://issues.apache.org/jira/browse/HDFS-7787
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0
> Environment: 2 namenodes HA, 6 datanodes in two racks
>Reporter: Frode Halvorsen
>  Labels: balance, hdfs, replication-performance
>
> Each file has a setting of 3 replicas, split across different racks.
> After a simulated crash of one rack (shutdown of all nodes, deleted 
> data directories, and restarted the nodes) and decommissioning of one of the 
> nodes in the other rack, the replication does not follow 'normal' rules...
> My cluster has approx. 25 million files, and the one node I am now trying to 
> decommission has 9 million underreplicated blocks and 3.5 million blocks 
> with 'no live replicas'. After a restart of the node, it starts to replicate 
> both types of blocks, but after a while, it only replicates under-replicated 
> blocks with other live copies. I would think that the 'normal' way to do 
> this would be to make sure that all blocks this node keeps the only copy of 
> should be the first to be replicated/balanced?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7787) Wrong priority of replication

2015-02-13 Thread Frode Halvorsen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320169#comment-14320169
 ] 

Frode Halvorsen commented on HDFS-7787:
---

Actually, I had the situation that the node under decommission was told to 
replicate blocks with live copies before blocks without live copies. That was 
my 'problem'.
The node under decommission reports two numbers: 
"number of underreplicated blocks" and "blocks with no live replicas", and my 
experience was that the number of underreplicated blocks was reduced while the 
number of 'no live replicas' remained the same. If I restarted the datanode, 
it started to replicate the most important blocks again, but after a while, it 
only replicated the underreplicated blocks again.

But yes: it would also be nice to have a priority for '1 live replica' over 
'2 live replicas' once the '0 live replicas' queue is empty :)  
I have not looked into how to contribute yet, so I won't assign this one to 
myself just now :)  I'm just learning to use this in the proper way, and I 
still think I have a few issues in my setup that I need to resolve before 
starting to code :)



[jira] [Commented] (HDFS-7787) Wrong priority of replication

2015-02-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318737#comment-14318737
 ] 

Ravi Prakash commented on HDFS-7787:


By the way, here are guidelines for contributing:
https://wiki.apache.org/hadoop/HowToContribute
Please assign the JIRA to yourself in case you intend to work on it.



[jira] [Commented] (HDFS-7787) Wrong priority of replication

2015-02-12 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318723#comment-14318723
 ] 

Ravi Prakash commented on HDFS-7787:


Frode!
The code for prioritizing under-replicated blocks is here: 
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java#L149
{noformat}
 *   {@link #QUEUE_HIGHEST_PRIORITY}: the blocks that must be replicated
 *   first. That is blocks with only one copy, or blocks with zero live
 *   copies but a copy in a node being decommissioned. These blocks
 *   are at risk of loss if the disk or server on which they
 *   remain fails.
{noformat}
It seems you want to split QUEUE_HIGHEST_PRIORITY into two queues: one for 
"blocks with only one copy" and a more important one for "blocks with zero 
live copies but a copy on a node being decommissioned". This seems reasonable 
to me. Please see if you can submit a patch; it'd be much appreciated.
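As a rough sketch of that split (this is NOT the actual UnderReplicatedBlocks code; the queue names and numbering here are invented for illustration), the idea is simply that "zero live copies, but a copy on a decommissioning node" outranks "exactly one live copy":

```java
// Hypothetical priority assignment, sketching the proposed split of
// QUEUE_HIGHEST_PRIORITY into two queues. Lower number = more urgent.
class PrioritySketch {
    static final int QUEUE_ONLY_DECOMMISSIONING_COPIES = 0; // most urgent
    static final int QUEUE_ONE_LIVE_COPY = 1;
    static final int QUEUE_OTHER_UNDER_REPLICATED = 2;

    static int getPriority(int liveReplicas, int decommissioningReplicas) {
        if (liveReplicas == 0 && decommissioningReplicas > 0) {
            // Every remaining copy is on a node that is leaving the cluster.
            return QUEUE_ONLY_DECOMMISSIONING_COPIES;
        } else if (liveReplicas == 1) {
            // One live copy: urgent, but survives a decommission completing.
            return QUEUE_ONE_LIVE_COPY;
        }
        return QUEUE_OTHER_UNDER_REPLICATED;
    }
}
```

Replication work would then drain queue 0 before queue 1, which is exactly the behavior Frode expected during the decommission.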

You can change the rate of re-replication with configuration parameters: 
please look at dfs.namenode.replication.interval, 
dfs.namenode.replication.work.multiplier.per.iteration, etc. Could you please 
remove that point from the description of the JIRA?
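For reference, those knobs go in hdfs-site.xml; the values below are illustrative only, and the right settings depend on cluster size and network headroom:

```xml
<!-- hdfs-site.xml: illustrative values only -->
<property>
  <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
  <!-- Block transfers scheduled per live datanode per iteration;
       raising it speeds up re-replication at the cost of more load. -->
  <value>4</value>
</property>
<property>
  <name>dfs.namenode.replication.interval</name>
  <!-- Seconds between namenode replication-work computations. -->
  <value>3</value>
</property>
```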

> Wrong priority of replication
> 
>
> Key: HDFS-7787
> URL: https://issues.apache.org/jira/browse/HDFS-7787
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0
> Environment: 2 namenodes HA, 6 datanodes in two racks
>Reporter: Frode Halvorsen
>  Labels: balance, hdfs, replication-performance
>
> Each file has a setting of 3 replicas, split across different racks.
> After a simulated crash of one rack (shutdown of all nodes, deleted 
> data directories, and restarted the nodes) and decommissioning of one of the 
> nodes in the other rack, the replication does not follow 'normal' rules...
> My cluster has approx. 25 million files, and the one node I am now trying to 
> decommission has 9 million underreplicated blocks and 3.5 million blocks 
> with 'no live replicas'. After a restart of the node, it starts to replicate 
> both types of blocks, but after a while, it only replicates under-replicated 
> blocks with other live copies. I would think that the 'normal' way to do 
> this would be to make sure that all blocks this node keeps the only copy of 
> should be the first to be replicated/balanced? Another thing is that this 
> takes 'forever'. At the rate it's going now, it will run for a couple of 
> months before I can take down the node for maintenance. It only has approx. 
> 250 GB of data in total...


