[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-09-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938370#comment-14938370
 ] 

Hudson commented on HDFS-7621:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2408 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2408/])
HDFS-7621. Erasure Coding: update the Balancer/Mover data migration (zhezhang: rev 673280df24f0228bf01777035ceeab8807da8c40)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlocksWithLocations.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java


> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Walter Su
> Labels: HDFS-7285
> Fix For: HDFS-7285
>
> Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, 
> HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch, 
> HDFS-7621.006.patch, HDFS-7621.007.patch
>
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-09-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938564#comment-14938564
 ] 

Hudson commented on HDFS-7621:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #465 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/465/])
HDFS-7621. Erasure Coding: update the Balancer/Mover data migration (zhezhang: rev 673280df24f0228bf01777035ceeab8807da8c40)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlocksWithLocations.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto




[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-09-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937994#comment-14937994
 ] 

Hudson commented on HDFS-7621:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #439 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/439/])
HDFS-7621. Erasure Coding: update the Balancer/Mover data migration (zhezhang: rev 673280df24f0228bf01777035ceeab8807da8c40)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlocksWithLocations.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java




[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-09-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937775#comment-14937775
 ] 

Hudson commented on HDFS-7621:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #473 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/473/])
HDFS-7621. Erasure Coding: update the Balancer/Mover data migration (zhezhang: rev 673280df24f0228bf01777035ceeab8807da8c40)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlocksWithLocations.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java




[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-09-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937518#comment-14937518
 ] 

Hudson commented on HDFS-7621:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1203 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1203/])
HDFS-7621. Erasure Coding: update the Balancer/Mover data migration (zhezhang: rev 673280df24f0228bf01777035ceeab8807da8c40)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlocksWithLocations.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java




[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-09-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937268#comment-14937268
 ] 

Hudson commented on HDFS-7621:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8548 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8548/])
HDFS-7621. Erasure Coding: update the Balancer/Mover data migration (zhezhang: rev 673280df24f0228bf01777035ceeab8807da8c40)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlocksWithLocations.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto




[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-09-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937072#comment-14937072
 ] 

Hudson commented on HDFS-7621:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2379 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2379/])
HDFS-7621. Erasure Coding: update the Balancer/Mover data migration (zhezhang: rev 673280df24f0228bf01777035ceeab8807da8c40)
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-EC-7285.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/mover/TestMover.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/mover/Mover.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/protocol/BlocksWithLocations.java




[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-28 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564236#comment-14564236
 ] 

Zhe Zhang commented on HDFS-7621:
-

Thanks Walter for the discussion.

bq. Changes like getBlockList() and markMovedIfGoodBlock need be done anyway
This is a good point. We do need to adjust the {{bytesReceived}} in 
{{getBlockList}}. Not so sure about {{markMovedIfGoodBlock}}.

I reviewed the patch again; seems to me it does the following. Please find my 
comments inline with each item.
# Renames {{PendingMove#block}} to {{reportedBlock}}. This is a clean change 
(+1)
# Creates and handles {{BlocksWithLocations#StripedBlockWithLocations}}. This 
change looks good to me. +1 pending the below minor issues:
#* Please fix indentation below (which caused long line):
{code}
+public StripedBlockWithLocations(BlockWithLocations blk, byte[] indices,
+ short dataBlockNum, short parityBlockNum) {
{code}
#* Please fix the indentation below:
{code}
+  BlockWithLocations blkWithLocs = new BlockWithLocations(block,
+  datanodeUuids, storageIDs,
+  storageTypes);
{code}
#* Not necessary to fix in this JIRA, but let's use _internal blocks_ 
consistently, instead of _inner blocks_.
#* The constructor could also use a {{checkArgument}} to make sure the number 
of indices is the same as the number of locations.
#* {{parityBlockNum}} is not used and should be removed.
# Creates and uses {{DBlockStriped}}. Per above discussion, it not only 
translates {{storage}} to internal block index, but also carries 
{{dataBlockNum}} to calculate {{bytesReceived}}. So I agree it's a reasonable 
change. +1 pending the below:
#* Seems like the new {{getBlockList()}} method should use a cleaner if-else 
flow:
{code}
if (blkLocs instanceof StripedBlockWithLocations) {
  StripedBlockWithLocations sblkLocs =
      (StripedBlockWithLocations) blkLocs;
  xxx
} else {
  xxx
}
{code}
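
To illustrate the {{checkArgument}} suggestion above, here is a minimal sketch; it is not code from the patch. {{StripedLocationsSketch}} and its fields are hypothetical simplifications (string UUIDs instead of the real location types), and a plain {{IllegalArgumentException}} stands in for a Guava-style {{checkArgument}} so the example has no dependencies.

```java
/**
 * Hypothetical, simplified stand-in for StripedBlockWithLocations.
 * String UUIDs replace the real location types; all names are illustrative.
 */
final class StripedLocationsSketch {
  private final String[] datanodeUuids; // one entry per reported location
  private final byte[] indices;         // internal block index at each location
  private final short dataBlockNum;     // number of data blocks in the group

  StripedLocationsSketch(String[] datanodeUuids, byte[] indices,
      short dataBlockNum) {
    // Plain-Java equivalent of a checkArgument-style precondition:
    // every location must come with exactly one internal block index.
    if (indices.length != datanodeUuids.length) {
      throw new IllegalArgumentException("indices.length=" + indices.length
          + " != locations.length=" + datanodeUuids.length);
    }
    this.datanodeUuids = datanodeUuids.clone();
    this.indices = indices.clone();
    this.dataBlockNum = dataBlockNum;
  }

  int getNumLocations() {
    return datanodeUuids.length;
  }

  byte[] getIndices() {
    return indices.clone();
  }

  short getDataBlockNum() {
    return dataBlockNum;
  }
}
```

Failing fast in the constructor keeps a mismatched indices array from surfacing later as an out-of-bounds lookup when a storage is translated to an internal block index.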

Great work!

> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Walter Su
> Labels: HDFS-7285
> Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, 
> HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch, 
> HDFS-7621.006.patch
>
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.





[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-28 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562487#comment-14562487
 ] 

Walter Su commented on HDFS-7621:
-

My changes to {{Dispatcher}} are small. The patch is big because I renamed 
{{PendingMove.block}} to {{reportedBlock}} and changed the javadoc.
{{GlobalBlockMap.putIfAbsent(..)}} is almost the same as the original 
{{get(..)}}. The logic doesn't change.
{{getBlockList()}} changes because it needs to calculate 
{{bytesReceived}} correctly. The logic doesn't change.
{{DBlockStriped}} is incremental.

{{PendingMove}} handles the reported block as it did even before the 
EC branch came out. Most changes are about renaming and javadoc.
The only logic change is here:
{code}
@@ -224,7 +226,11 @@ private boolean markMovedIfGoodBlock(DBlock block, StorageType targetStorageType
       synchronized (block) {
         synchronized (movedBlocks) {
           if (isGoodBlockCandidate(source, target, targetStorageType, block)) {
-            this.block = block;
+            if (block instanceof DBlockStriped) {
+              reportedBlock = ((DBlockStriped) block).getInnerBlock(source);
+            } else {
+              reportedBlock = block;
+            }
             if (chooseProxySource()) {
               movedBlocks.put(block);
               if (LOG.isDebugEnabled()) {
{code}
I'm sure it's really small. 
Changes like {{getBlockList()}} and {{markMovedIfGoodBlock}} need be done 
anyway if we use {{GlobalBlockGroupMap}}.
So I'd like to stay with the 006 patch.
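
A minimal sketch of what translating a source storage into an internal block could look like; it is not the patch's {{getInnerBlock}}. The class name, the use of strings for storages, and the rule that an internal block ID equals the block group ID plus the index within the group are assumptions made for this illustration.

```java
import java.util.List;

/** Hypothetical sketch; storages are plain strings instead of StorageGroup. */
final class InternalBlockLookupSketch {
  private final long blockGroupId;
  private final List<String> storages; // where each reported internal block lives
  private final byte[] indices;        // indices[i] is the index stored on storages.get(i)

  InternalBlockLookupSketch(long blockGroupId, List<String> storages,
      byte[] indices) {
    if (storages.size() != indices.length) {
      throw new IllegalArgumentException("one index per storage is required");
    }
    this.blockGroupId = blockGroupId;
    this.storages = storages;
    this.indices = indices.clone();
  }

  /** Returns the ID of the internal block stored on the given source. */
  long internalBlockIdAt(String source) {
    int pos = storages.indexOf(source);
    if (pos < 0) {
      throw new IllegalArgumentException("no internal block on " + source);
    }
    // Assumption: internal block ID = block group ID + index in the group.
    return blockGroupId + indices[pos];
  }
}
```

A contiguous block needs no such translation, which is why the change above only takes this path for {{DBlockStriped}}.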

> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Walter Su
> Labels: HDFS-7285
> Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, 
> HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch, 
> HDFS-7621.006.patch
>
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.





[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-27 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562357#comment-14562357
 ] 

Zhe Zhang commented on HDFS-7621:
-

Sorry for coming to this review late.

Again, the patch looks reasonable overall. My biggest concern is that the patch 
changes {{Dispatcher}} pretty heavily in place. It'd be better if we leave the 
current logic mostly unchanged and add striping logic incrementally. This will 
make the merge review much cleaner and easier.

{{DBlock}} actually just maps a {{Block}} to locations. The added 
{{DBlockStriped}} does 2 things: 1) maps {{Block}} to locations (inherited from 
{{DBlock}}); and 2) when given a {{StorageGroup}}, calculates the index of the 
internal block. How about creating a structure only for purpose #2? Something 
like the below:

{code}
private static class GlobalBlockGroupMap {
  private final Map map = new HashMap<>();
  private int getInternalBlkIdx(Block b, StorageGroup storage) {
...
  }
}
{code}

Then the change to existing {{Dispatcher}} code can be simple. Right before 
sending a move command to DN, just check the new map to see if we need to 
translate the block ID. It seems we can add the check in {{executePendingMove}} 
for this purpose, but I'm not 100% sure.
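
The side-table idea above could be sketched roughly as follows. This is only an illustration under assumptions: blocks are keyed by a {{long}} ID and storages by a string, the {{put}} method and the {{NOT_STRIPED}} sentinel are hypothetical, and only the method name {{getInternalBlkIdx}} comes from the proposal.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical side table: (block group ID, storage) -> internal block index. */
final class BlockGroupIndexMapSketch {
  /** Returned when nothing is recorded, i.e. no block ID translation is needed. */
  static final int NOT_STRIPED = -1;

  private final Map<Long, Map<String, Integer>> map = new HashMap<>();

  /** Records which internal block of a group lives on a storage. */
  void put(long blockGroupId, String storage, int internalBlkIdx) {
    map.computeIfAbsent(blockGroupId, k -> new HashMap<>())
        .put(storage, internalBlkIdx);
  }

  /** Looks up the internal block index, or NOT_STRIPED if none is recorded. */
  int getInternalBlkIdx(long blockId, String storage) {
    Map<String, Integer> byStorage = map.get(blockId);
    if (byStorage == null) {
      return NOT_STRIPED;
    }
    return byStorage.getOrDefault(storage, NOT_STRIPED);
  }
}
```

With a table like this, the existing dispatch path would only consult the map right before sending the move command and would otherwise stay unchanged.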

> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Walter Su
> Labels: HDFS-7285
> Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, 
> HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch, 
> HDFS-7621.006.patch
>
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.





[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-20 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553735#comment-14553735
 ] 

Walter Su commented on HDFS-7621:
-

006 patch changes: 1. rebase; 2. update {{blockManager}} logic

> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Walter Su
> Labels: HDFS-7285
> Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, 
> HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch, 
> HDFS-7621.006.patch
>
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.





[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-18 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549291#comment-14549291
 ] 

Zhe Zhang commented on HDFS-7621:
-

Thanks Walter for the update! The main changes ({{DBlockStriped}} and 
{{StripedBlockWithLocations}}) look good to me. I will post a full review soon. 

> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Walter Su
> Labels: HDFS-7285
> Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, 
> HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch
>
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.





[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-15 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545421#comment-14545421
 ] 

Walter Su commented on HDFS-7621:
-

The 005 patch is a total rewrite, except for the tests. Please review.
Hints for reviewing:
1. New classes {{StripedBlockWithLocations}} and {{DBlockStriped}} represent a 
block group.
2. {{PendingMove}} represents a (source, target, reportedBlock) triple. 
A newly created {{PendingMove}} is (source, target, null). 
{{PendingMove.markMovedIfGoodBlock}} gets a block or block group from the 
source, validates it, parses it, and saves the reported block in the 
{{PendingMove}}. After that, the {{PendingMove}} is a complete (source, target, 
reportedBlock) triple; it is final and will be dispatched.
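
The lifecycle described in hint 2 can be pictured with the following sketch. It is an illustration only: the types are reduced to strings and a {{long}} block ID, the validity check is passed in as a predicate, and rejecting a second binding is an assumption of this sketch rather than something confirmed by the patch.

```java
import java.util.function.LongPredicate;

/** Hypothetical sketch of the (source, target, reportedBlock) lifecycle. */
final class PendingMoveSketch {
  private final String source;
  private final String target;
  private Long reportedBlock; // null until markMovedIfGoodBlock succeeds

  PendingMoveSketch(String source, String target) {
    this.source = source;
    this.target = target;
  }

  /** Binds the candidate if it passes the check and nothing is bound yet. */
  boolean markMovedIfGoodBlock(long candidate, LongPredicate isGoodCandidate) {
    if (reportedBlock != null || !isGoodCandidate.test(candidate)) {
      return false;
    }
    reportedBlock = candidate;
    return true;
  }

  /** A move can only be dispatched once the triple is complete. */
  boolean isReadyToDispatch() {
    return reportedBlock != null;
  }

  @Override
  public String toString() {
    return "(" + source + ", " + target + ", " + reportedBlock + ")";
  }
}
```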


> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Walter Su
> Labels: HDFS-7285
> Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, 
> HDFS-7621.003.patch, HDFS-7621.004.patch, HDFS-7621.005.patch
>
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.





[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-14 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544545#comment-14544545
 ] 

Zhe Zhang commented on HDFS-7621:
-

Thanks for the update Walter.

Structural:
# {{convertToBlockWithLocations}} looks good to me now. So I'm +1 on the 
{{BlockManager}} changes
# One concern on the new {{DBlock}} code is that each {{DBlock}} should 
represent a single block with locations. Its super class, 
{{MovedBlocks#Locations}}, is also clearly designed for a single block. 
Therefore {{nonCollocatedBlock}} looks strange because each striped {{DBlock}} 
actually has multiple peer blocks that it has to avoid.
# bq. Balancer handles block and doesn't know file. Balancer gets blocks from 
Node.
Thanks for clarifying!

Nits:
# A few lines are too long. You might already know that we prefer each line to 
be under 80 characters.
{code}
+  updateDBlockLocations(nonCollocatedBlock, nonCollocatedBlockWithLocations);
+  DBlock newDBlock(Block block, List locations, DBlock stripedBlock) {
+  final ExtendedBlock reportedBlock = new ExtendedBlock(lsb.getBlock());
+  long numBytes = getInternalBlockLength(lsb.getBlock().getNumBytes(),
+  final List reportedBlockLocation = new ArrayList<>(1);
+  public static void verifyLocatedStripedBlocks(LocatedBlocks lbs, int groupSize) {
+  client = NameNodeProxies.createProxy(conf, cluster.getFileSystem(0).getUri(),
+  LocatedBlocks locatedBlocks = client.getBlockLocations(fileName, 0, fileLen);
+  final DBlock db = mover.newDBlock(lb.getBlock().getLocalBlock(), locations, null);
+  ClientProtocol client = NameNodeProxies.createProxy(conf, cluster.getFileSystem(0).getUri(),
+  client.setStoragePolicy(barDir, HdfsServerConstants.HOT_STORAGE_POLICY_NAME);
+  LocatedBlocks locatedBlocks = client.getBlockLocations(fooFile, 0, fileLen);
+  DFSTestUtil.verifyLocatedStripedBlocks(locatedBlocks, dataBlocks + parityBlocks);
+  DFSTestUtil.verifyLocatedStripedBlocks(locatedBlocks, dataBlocks + parityBlocks);
{code}
I usually use a simple script to check for long lines. In case you need:
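{code}
#! /usr/local/bin/python3
import sys

# Print the added lines of a patch file (sys.argv[1]) that are too long.
with open(sys.argv[1], 'r') as inf:
  for line in inf:
    # len(line) includes the leading '+' and the trailing newline.
    if line.startswith('+ ') and len(line) > 81:
      print(line)
{code}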
# Looks like the indentation is wrong:
{code}
+private void updateDBlockLocations(DBlock block,
+   BlockWithLocations blockWithLocations){
{code}
# {{nonCollocatedBlock}} literally means "currently not collocated". Maybe just 
{{toAvoid}}? The name doesn't need to have "block" because the type already 
implies it.


> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Jing Zhao
> Assignee: Walter Su
> Labels: HDFS-7285
> Attachments: HDFS-7621.001.patch, HDFS-7621.002.patch, 
> HDFS-7621.003.patch, HDFS-7621.004.patch
>
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.





[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-14 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543305#comment-14543305
 ] 

Walter Su commented on HDFS-7621:
-

>...moveBlockAcrossStorage requires Mover send correct numOfBytes, otherwise DN 
>reject the command.
Problem solved.
Mover computes the numBytes of an internal block from the ECSchema in 
HdfsFileStatus.
Balancer gets the numBytes of an internal block from the NN, which parses it 
using BlockInfoStriped.



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-13 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14543136#comment-14543136
 ] 

Walter Su commented on HDFS-7621:
-

Thanks [~zhz] for helpful comments.
>...convertToBlockWithLocations does similar things as the updated addBlock 
>method
I cleaned up the code and added Javadoc, so it's clearer now. The two methods 
implement different logic.

>... I still think it's easier to just include an final int[] blockIndices
Parsing at the NN takes less code. Also, the changes to Dispatcher are small. 

>...nonCollocatedBlock only makes sense for striped block group
I added Javadoc to nonCollocatedBlock.
You can think of it as a normal block that needs to avoid collocation. We can 
implement collocation the same way in the future if we have such a requirement.

>...why Balancer is using this BlocksWithLocations structure while Mover uses 
>LocatedBlocks
Balancer handles blocks and doesn't know about files. Balancer gets blocks from 
a node.
Mover knows about files. Mover gets blocks from a file.

>...moveBlockAcrossStorage requires Mover send correct numOfBytes, otherwise DN 
>reject the command.
My concern is {{moveBlockAcrossStorage}}; HDFS-8289 can't solve that problem. 
Balancer doesn't know about files. Is it necessary to check for corruption when 
moving across storage inside a DN?



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-12 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14540932#comment-14540932
 ] 

Zhe Zhang commented on HDFS-7621:
-

bq. DBlock == reportedBlock. DBlock represents reportedBlock.
Good work Walter! After reading the patch in detail I understand your point 
now. Basically, for DN related tasks, NN should parse the block group before 
issuing the task. This is a good point and I agree with you. This is similar to 
how we are invalidating striped blocks now.

The changes on {{BlockManager}} look good to me. The only nit is that 
{{convertToBlockWithLocations}} does similar things as the updated {{addBlock}} 
method. We should either add some Javadoc for {{convertToBlockWithLocations}} 
or try to consolidate them into a single method which handles both striped and 
contiguous blocks. I'm +1 on committing this separately. This is a good 
improvement to NameNode {{getBlocks}} RPC call.

On {{Dispatcher}}:
# {{nonCollocatedBlock}} needs some Javadoc. 
# bq. I don't want to extend DBlock to support BlockGroup, It's confusing.
I understand this rationale but it seems we are already extending {{DBlock}} 
for striped block groups. {{nonCollocatedBlock}} only makes sense for striped 
block groups and it needs to be understood in the context of block striping. I 
still think it's easier to just include a {{final int[] blockIndices}} field 
in {{BlocksWithLocations}} so we can easily parse the returned arrays from NN. 

A related question I have is why {{Balancer}} is using this 
{{BlocksWithLocations}} structure while {{Mover}} uses {{LocatedBlocks}}, more 
like a client. Any chance we can consolidate them a little more? [~jingzhao] 
could you share more insights on this?



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-06 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531902#comment-14531902
 ] 

Walter Su commented on HDFS-7621:
-

{{moveBlockAcrossStorage}} requires the Mover to send the correct 
{{numOfBytes}}; otherwise the DN rejects the command.
I'm thinking:
1. Is it necessary to check for corruption when moving across storage inside a 
DN?
2. If so, we wait for HDFS-8289 so that we can get the ECSchema from FileStatus 
and calculate {{numOfBytes}} by calling 
StripedBlockUtil.constructInternalBlock(..)
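
For readers following the {{numOfBytes}} discussion, here is a rough sketch of how the length of one internal block could be derived from the schema parameters. It is not the actual StripedBlockUtil code: the function name and its parameters ({{group_size}}, {{cell_size}}, {{num_data}}, {{index}}) are hypothetical, and it assumes the usual round-robin cell layout in which data is written cell by cell across the data blocks and each parity cell is as long as the longest data cell of its stripe.

```python
def internal_block_length(group_size, cell_size, num_data, index):
    """Bytes stored in internal block `index` of a striped block group.

    group_size: data bytes in the whole block group (parity excluded).
    index: 0 .. num_data-1 are data blocks, larger values are parity blocks.
    All names here are illustrative, not taken from the patch.
    """
    stripe_size = cell_size * num_data
    last_stripe_len = group_size % stripe_size
    if last_stripe_len == 0:
        # Only full stripes: every internal block has the same length.
        return group_size // num_data
    full_stripes = group_size // stripe_size
    if index < num_data:
        # Data block: what is left of the partial stripe once the cells
        # of the preceding data blocks have been filled.
        remaining = max(last_stripe_len - index * cell_size, 0)
    else:
        # Parity block: covers the longest data cell of the partial stripe.
        remaining = last_stripe_len
    return full_stripes * cell_size + min(remaining, cell_size)
```

With a 3-data-block schema, 4-byte cells and 30 bytes of data, the data blocks would hold 12, 10 and 8 bytes under these assumptions, which is why a single group-level length cannot simply be reused for every internal block.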



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-06 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14530154#comment-14530154
 ] 

Walter Su commented on HDFS-7621:
-

002 patch is ready for review.



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-04 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527825#comment-14527825
 ] 

Walter Su commented on HDFS-7621:
-

DN knows reportedBlock but doesn't know BlockGroup.
Balancer/Mover communicate with DN directly, so it's better for Balancer/Mover 
to schedule reportedBlocks.
DBlock == reportedBlock. DBlock represents reportedBlock.
I don't want to extend DBlock to support BlockGroup; it's confusing.
DBlock.stripedBlock is of DBlock type, but I treat it as a *normal* block: it's 
an associated block, a brother block. A DBlock should avoid being co-located 
with its brother block.
Balancer/Mover don't know anything about stripe/blockGroup.

>... the current code should already be able to avoid placing 2 striped blocks 
>in the same group on the same node.
That's true. But how does Balancer send a command to DN? With the blockGroup Id 
or the reportedBlock Id? Apparently it should be the reportedBlock Id. 
Balancer/Mover need to figure out the reportedBlock Id using indices[]. Mover 
knows indices[] because LocatedStripedBlock includes it, but Balancer doesn't. 
We can make it known to Balancer, of course, but it's complicated.
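
To illustrate the "figure out reportedBlock Id using indices[]" step, a minimal sketch follows. It assumes an ID layout in which the block group ID has its low bits zeroed and each internal (reported) block ID is the group ID plus the block's index; the 4-bit width and all function names are assumptions made for this example, not something stated in this thread.

```python
INDEX_BITS = 4                       # assumed width of the index field
INDEX_MASK = (1 << INDEX_BITS) - 1   # 0b1111


def internal_block_id(group_id, index):
    """ID a DN would report for internal block `index` of `group_id`."""
    if group_id & INDEX_MASK:
        raise ValueError("group id must have its index bits cleared")
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("index out of range")
    return group_id + index


def split_internal_block_id(block_id):
    """Inverse mapping: (group id, index) for a reported block ID."""
    return block_id & ~INDEX_MASK, block_id & INDEX_MASK


def reported_blocks(group_id, indices, locations):
    """Pair every location of a group with the block ID stored there.

    indices[i] is the internal block index held by locations[i].
    """
    return [(internal_block_id(group_id, idx), loc)
            for idx, loc in zip(indices, locations)]
```

Under this layout the Mover can run {{reported_blocks}} directly because it already has indices[], while the Balancer would need either the indices or the already-converted IDs from the NN.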




[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-05-04 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527189#comment-14527189
 ] 

Zhe Zhang commented on HDFS-7621:
-

Thanks for working on this Walter. 

bq. Balancer doesn't know EC Block group, it only know reported block( with 
real blockId in datanode).
I'm not so sure about this. NameNode should only track striped block groups 
instead of individual internal blocks. I also think you should be able to get 
all internal block locations when creating {{MovedBlocks#Locations}}. If that's 
the case, the current code should already be able to avoid placing 2 striped 
blocks in the same group on the same node. Maybe we can add a test first and 
verify that from the debugging messages?

If the above is verified, the additional logic we need to take care of is to 
avoid violating the rack properties of a striped block group.



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-04-21 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14506368#comment-14506368
 ] 

Walter Su commented on HDFS-7621:
-

The 001 patch is an initial patch. No tests are included. I want to know if the 
logic works.

What the patch does:
1. {{Balancer.isGoodBlockCandidate(..)}} is a guard function. The patch makes 
the function take the EC block group into consideration, so it can avoid 
placing 2 blocks (of a BlockGroup) on the same node. The 
{{reduceNumOfRacks(..)}} guard avoids reducing the number of racks after a 
movement, so it tries not to violate the (specific) placement policy but can't 
guarantee it.

How the patch works:
Balancer doesn't know about EC block groups; it only knows reported blocks 
(with the real blockId on the datanode). But I want it to take the locations of 
the group into consideration when choosing targets.
It requires the correctness of {{DBlock.isLocatedOn(..)}} (HDFS-8147).
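
The guard described above could be sketched roughly as follows. This is a simplified illustration rather than the patch itself: the function name, its parameters and the {{rack_of}} map are hypothetical, and the rack check is reduced to comparing the number of distinct racks before and after the move.

```python
def is_good_block_candidate(source, target, block_nodes, sibling_nodes, rack_of):
    """Decide whether moving a block from `source` to `target` is acceptable.

    block_nodes:   nodes currently holding the block being moved
                   (`source` is assumed to be one of them).
    sibling_nodes: nodes holding other internal blocks of the same block
                   group; empty for a contiguous (replicated) block.
    rack_of:       dict mapping node name -> rack name.
    """
    # Never put two blocks of the same group, or two replicas of the
    # same block, on one node.
    if target in block_nodes or target in sibling_nodes:
        return False
    occupied = set(block_nodes) | set(sibling_nodes)
    racks_before = {rack_of[n] for n in occupied}
    racks_after = {rack_of[n] for n in (occupied - {source}) | {target}}
    # The move must not reduce the number of racks that are covered.
    return len(racks_after) >= len(racks_before)
```

As in the comment above, a check of this kind only avoids making the placement worse; it does not guarantee that a specific placement policy is satisfied.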



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-04-21 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504588#comment-14504588
 ] 

Walter Su commented on HDFS-7621:
-

I'm still reading the code and thinking about how to do it. By the way, I found 
a bug in the balancer: HDFS-8204.



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-04-17 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500578#comment-14500578
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7621:
---

[~walter.k.su], are you working on this?  I believe so but just want to make 
sure.  BTW, this one has a higher priority than HDFS-7613 since this is in 
Phase 1.



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-04-02 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394028#comment-14394028
 ] 

Jing Zhao commented on HDFS-7621:
-

Sure. Assigned the jira to you. Thanks for working on this, Walter!



[jira] [Commented] (HDFS-7621) Erasure Coding: update the Balancer/Mover data migration logic

2015-04-02 Thread Walter Su (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393882#comment-14393882
 ] 

Walter Su commented on HDFS-7621:
-

Hi, [~jingzhao]. I'm interested in this jira. I think it's related to HDFS-7613 
which I'm working on. Can you assign this jira to me? Or maybe you already have 
some progress?

> Erasure Coding: update the Balancer/Mover data migration logic
> --
>
> Key: HDFS-7621
> URL: https://issues.apache.org/jira/browse/HDFS-7621
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
>
> Currently the Balancer/Mover only considers the distribution of replicas of 
> the same block during data migration: the migration cannot decrease the 
> number of racks. With EC the Balancer and Mover should also take into account 
> the distribution of blocks belonging to the same block group.


