[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-09-04 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730727#comment-14730727
 ] 

Surendra Singh Lilhore commented on HDFS-8946:
--

*One suggestion:*

In this JIRA some logs were removed.
1.
{code}
if (requiredSize > remaining - scheduledSize) {
  logNodeIsNotChosen(storage, "the node does not have enough "
      + storage.getStorageType() + " space"
      + " (required=" + requiredSize
      + ", scheduled=" + scheduledSize
      + ", remaining=" + remaining + ")");
  return false;
}
{code}

2. 
{code}
logNodeIsNotChosen(storage, "storage is read-only");
{code}

3.
{code}
logNodeIsNotChosen(storage, "storage has failed");
{code}

I think this is very important for debugging issues. From the debug log we can 
see why a DN was not selected for block replication.

Can we add these back? At least the first log should be there.
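Just to illustrate, a rough sketch of how the first log could be restored in 
the new code path (I am assuming the space check now lives in 
{{DatanodeDescriptor#chooseStorage4Block}} after this patch and that a logger 
is available there; the names below are my guess, not the committed code):
{code}
// Sketch only (assumed location and names): re-add the "not enough space"
// debug log before returning null from the new space check.
if (requiredSize > remaining - scheduledSize) {
  if (LOG.isDebugEnabled()) {
    LOG.debug("Datanode " + this + " does not have enough " + t
        + " space (required=" + requiredSize
        + ", scheduled=" + scheduledSize
        + ", remaining=" + remaining + ")");
  }
  return null;
}
{code}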

In HDFS-9023 I want to do some log improvements for the case when the NN is 
not able to identify DNs for replication.


> Improve choosing datanode storage for block placement
> -
>
> Key: HDFS-8946
> URL: https://issues.apache.org/jira/browse/HDFS-8946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Fix For: 2.8.0
>
> Attachments: HDFS-8946.001.patch, HDFS-8946.002.patch, 
> HDFS-8946.003.patch
>
>
> This JIRA is to:
> Improve choosing the datanode storage for block placement:
> In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, 
> {{chooseRandom}}), we have the following logic to choose a datanode storage 
> to place a block.
> For a given storage type, we iterate over the storages of the datanode. But 
> the datanode only cares about the storage type. In the loop, we check 
> according to the storage type and return the first storage if the storages 
> of that type on the datanode fit the requirement. So we can remove the 
> iteration over storages and just need to check once to find a good storage 
> of the given type; this is more efficient when the storages of that type on 
> the datanode don't fit the requirement, since we don't need to loop over all 
> storages and repeat the same check.
> Besides, there is no need to shuffle the storages, since we only need to 
> check according to the storage type on the datanode once.
> This also improves the logic and makes it clearer.
> {code}
>   if (excludedNodes.add(localMachine) // was not in the excluded list
>       && isGoodDatanode(localDatanode, maxNodesPerRack, false,
>           results, avoidStaleNodes)) {
>     for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
>         .entrySet().iterator(); iter.hasNext(); ) {
>       Map.Entry<StorageType, Integer> entry = iter.next();
>       for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
>           localDatanode.getStorageInfos())) {
>         StorageType type = entry.getKey();
>         if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
>             results, type) >= 0) {
>           int num = entry.getValue();
>           ...
> {code}
> (current logic above)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724799#comment-14724799
 ] 

Hudson commented on HDFS-8946:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2255 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2255/])
HDFS-8946. Improve choosing datanode storage for block placement. (yliu) (yliu: 
rev 8fa41d9dd4b923bf4141f019414a1a8b079124c6)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724785#comment-14724785
 ] 

Hudson commented on HDFS-8946:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #316 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/316/])
HDFS-8946. Improve choosing datanode storage for block placement. (yliu) (yliu: 
rev 8fa41d9dd4b923bf4141f019414a1a8b079124c6)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724772#comment-14724772
 ] 

Hudson commented on HDFS-8946:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2274 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2274/])
HDFS-8946. Improve choosing datanode storage for block placement. (yliu) (yliu: 
rev 8fa41d9dd4b923bf4141f019414a1a8b079124c6)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724746#comment-14724746
 ] 

Hudson commented on HDFS-8946:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #332 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/332/])
HDFS-8946. Improve choosing datanode storage for block placement. (yliu) (yliu: 
rev 8fa41d9dd4b923bf4141f019414a1a8b079124c6)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724742#comment-14724742
 ] 

Hudson commented on HDFS-8946:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #325 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/325/])
HDFS-8946. Improve choosing datanode storage for block placement. (yliu) (yliu: 
rev 8fa41d9dd4b923bf4141f019414a1a8b079124c6)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724602#comment-14724602
 ] 

Hudson commented on HDFS-8946:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1058 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/1058/])
HDFS-8946. Improve choosing datanode storage for block placement. (yliu) (yliu: 
rev 8fa41d9dd4b923bf4141f019414a1a8b079124c6)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724528#comment-14724528
 ] 

Hudson commented on HDFS-8946:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8375 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8375/])
HDFS-8946. Improve choosing datanode storage for block placement. (yliu) (yliu: 
rev 8fa41d9dd4b923bf4141f019414a1a8b079124c6)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724483#comment-14724483
 ] 

Yi Liu commented on HDFS-8946:
--

Thanks a lot for the review, [~kihwal]!
Will commit shortly.



[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-31 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723589#comment-14723589
 ] 

Kihwal Lee commented on HDFS-8946:
--

+1 the patch looks good.



[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-28 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14720899#comment-14720899
 ] 

Yi Liu commented on HDFS-8946:
--

The test failures are not related; they run successfully in my local environment.



[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14719294#comment-14719294
 ] 

Hadoop QA commented on HDFS-8946:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  5s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   8m  6s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 20s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 21s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 36s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 18s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 185m 32s | Tests failed in hadoop-hdfs. |
| | | 231m 51s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.blockmanagement.TestBlockManager |
|   | hadoop.hdfs.TestRollingUpgrade |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12752975/HDFS-8946.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e166c03 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12194/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12194/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12194/console |


This message was automatically generated.



[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-28 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718234#comment-14718234
 ] 

Yi Liu commented on HDFS-8946:
--

Will update the patch later.



[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-28 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718232#comment-14718232
 ] 

Yi Liu commented on HDFS-8946:
--

Thanks Masatake for the comments.
{quote}
LocatedBlock returned by ClientProtocol#addBlock, 
ClientProtocol#getAdditionalDatanode and ClientProtocol#updateBlockForPipeline 
contains storageIDs given by BlockPlacementPolicy#chooseTarget, but the user of 
these APIs (which is only DataStreamer) does not use the storageIDs. 
DataStreamer just sends the storage type to the DataNode, and the DataNode 
decides which volume to use on its own using VolumeChoosingPolicy.
{quote}
Yes, {{storageIDs}} is not used. 

{quote}
Any reason to change the logic of remaining size checking?
{quote}
Nice find. It's my fault, I forgot to add {{blockSize *}} while copying this 
logic from the original {{isGoodTarget}}.
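For clarity, the corrected sizing would look like this (a minimal sketch, 
assuming the check stays in {{DatanodeDescriptor#chooseStorage4Block}}; only 
the {{blockSize *}} factor changes):
{code}
final long requiredSize =
    blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE;
// getBlocksScheduled(t) is a block count, so convert it to bytes before
// comparing against remaining space, as the original isGoodTarget did.
final long scheduledSize = blockSize * getBlocksScheduled(t);
{code}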



[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-28 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718226#comment-14718226
 ] 

Masatake Iwasaki commented on HDFS-8946:


DatanodeDescriptor.java
{code}
final long requiredSize =
    blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE;
final long scheduledSize = getBlocksScheduled(t);
long remaining = 0;
DatanodeStorageInfo storage = null;
for (DatanodeStorageInfo s : getStorageInfos()) {
  if (s.getState() == State.NORMAL &&
      s.getStorageType() == t) {
    if (storage == null) {
      storage = s;
    }
    long r = s.getRemaining();
    if (r >= requiredSize) {
      remaining += r;
    }
  }
}
if (requiredSize > remaining - scheduledSize) {
  return null;
{code}

{{scheduledSize}} is a number of blocks but is used as if it were bytes.

Any reason to change the logic of remaining size checking?




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-28 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718222#comment-14718222
 ] 

Masatake Iwasaki commented on HDFS-8946:


Thanks for working on this, [~Hitliuyi]. I read the code of 
BlockPlacementPolicyDefault for HDFS-8945 recently, so I'll comment here while 
my memory is fresh :-)

bq. Besides, no need to shuffle the storages, since we only need to check 
according to the storage type on the datanode once.

Here is my understanding of this. Please correct me if I'm wrong:
   
The {{LocatedBlock}} returned by {{ClientProtocol#addBlock}}, 
{{ClientProtocol#getAdditionalDatanode}} and 
{{ClientProtocol#updateBlockForPipeline}} contains the storageIDs given by 
{{BlockPlacementPolicy#chooseTarget}}, but the user of these APIs (which is 
only DataStreamer) does not use the storageIDs. DataStreamer just sends the 
storage type to the DataNode, and the DataNode decides which volume to use on 
its own using {{VolumeChoosingPolicy}}.
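In other words (a small illustrative sketch of the datanode side; 
{{getVolumesOfType}} is a hypothetical helper here, not a real method, and 
only {{VolumeChoosingPolicy#chooseVolume}} is the actual extension point):
{code}
// The client only conveys a StorageType; the DataNode narrows the candidate
// volumes to that type and lets the configured policy pick one of them.
List<FsVolumeImpl> candidates = getVolumesOfType(storageType); // hypothetical
FsVolumeImpl chosen = volumeChoosingPolicy.chooseVolume(candidates, blockSize);
{code}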




[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717005#comment-14717005
 ] 

Hadoop QA commented on HDFS-8946:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 24s | Pre-patch trunk has 4 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 40s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 52s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 19s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 26s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m  7s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests | 161m 10s | Tests passed in hadoop-hdfs. 
|
| | | 205m 25s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12752741/HDFS-8946.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0bf2854 |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12162/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12162/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12162/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12162/console |


This message was automatically generated.



[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-27 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716632#comment-14716632
 ] 

Yi Liu commented on HDFS-8946:
--

Updated the patch to fix the test failure. The reason is that we can't remove 
{{addToExcludedNodes}}, since it's overridden by subclasses.
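(To illustrate why, a hedged sketch, not the actual code: subclasses such as 
the node-group placement policy override this hook to exclude more than the 
chosen node itself.)
{code}
// Illustrative only: a subclass can widen the exclusion set, e.g. to every
// node in the same node group as the chosen node.
@Override
protected int addToExcludedNodes(DatanodeDescriptor chosenNode,
    Set<Node> excludedNodes) {
  // add all nodes of chosenNode's node group to excludedNodes
  ...
}
{code}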



[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715156#comment-14715156
 ] 

Hadoop QA commented on HDFS-8946:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  17m 17s | Pre-patch trunk has 4 extant 
Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 56s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m 11s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   1m 21s | The applied patch generated  1 
new checkstyle issues (total was 85, now 83). |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 3  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 33s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 11s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 188m 13s | Tests failed in hadoop-hdfs. |
| | | 233m  9s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure |
|   | hadoop.hdfs.server.blockmanagement.TestHeartbeatHandling |
|   | hadoop.hdfs.server.namenode.TestFSNamesystem |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithNodeGroup |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12752472/HDFS-8946.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / a4d9acc |
| Pre-patch Findbugs warnings | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12132/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html
 |
| checkstyle |  
https://builds.apache.org/job/PreCommit-HDFS-Build/12132/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12132/artifact/patchprocess/whitespace.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12132/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12132/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/12132/console |


This message was automatically generated.


[jira] [Commented] (HDFS-8946) Improve choosing datanode storage for block placement

2015-08-26 Thread Yi Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713409#comment-14713409
 ] 

Yi Liu commented on HDFS-8946:
--

# Add {{chooseStorage4Block}} to choose a good storage of the given type from 
the datanode. It checks the state of the storage and whether there is enough 
space to place the block on the datanode. These conditions are the same as in 
the original code. Since the datanode only cares about the storage type when 
placing a block, this method returns the first storage of the given type we 
see (see the sketch after this list).
Then {{isGoodTarget}} can be removed, since we check all the conditions in 
{{chooseStorage4Block}}.
# No need to shuffle the storages of the datanode and no need to iterate over 
all the storages of the datanode in {{chooseLocalStorage}} and 
{{chooseRandom}}, since we can choose the storage of the given type using 
{{chooseStorage4Block}} once.
# {{addToExcludedNodes}} is redundant after finding a good storage, since we 
already add the datanode to excludedNodes at the beginning of checking the 
node.
# {{numOfAvailableNodes \-= newExcludedNodes;}} in {{chooseRandom}} is 
redundant, since we already do {{numOfAvailableNodes--}} at the beginning of 
checking the node.
# In {{DatanodeDescriptor#chooseStorage4Block}}, the {{t == null}} check is 
unnecessary, since it can never be null; it was added in HDFS-8863.
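
A rough sketch of the resulting selection in 
{{chooseLocalStorage}}/{{chooseRandom}} (the exact {{chooseStorage4Block}} 
signature here is my paraphrase, not copied from the patch):
{code}
// Ask the datanode once per requested storage type; no shuffling and no
// per-storage iteration in the placement policy anymore.
for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
    .entrySet().iterator(); iter.hasNext(); ) {
  Map.Entry<StorageType, Integer> entry = iter.next();
  DatanodeStorageInfo storage =
      chosenNode.chooseStorage4Block(entry.getKey(), blocksize);
  if (storage != null) {
    // good target found for this type; update results and counters as before
    ...
  }
}
{code}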

> Improve choosing datanode storage for block placement
> -
>
> Key: HDFS-8946
> URL: https://issues.apache.org/jira/browse/HDFS-8946
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Yi Liu
> Attachments: HDFS-8946.001.patch
>
>
> This JIRA is to:
> *1.* Improve choosing the datanode storage for block placement:
> In {{BlockPlacementPolicyDefault}} ({{chooseLocalStorage}}, 
> {{chooseRandom}}), we have the following logic to choose a datanode storage 
> to place a block.
> For a given storage type, we iterate over the storages of the datanode. But 
> the datanode only cares about the storage type. In the loop, we check 
> according to the storage type and return the first storage if the storages 
> of that type on the datanode fit the requirement. So we can remove the 
> iteration over storages and just need to check once to find a good storage 
> of the given type; this is more efficient when the storages of that type on 
> the datanode don't fit the requirement, since we don't need to loop over all 
> storages and repeat the same check.
> Besides, there is no need to shuffle the storages, since we only need to 
> check according to the storage type on the datanode once.
> This also improves the logic and makes it clearer.
> {code}
>   if (excludedNodes.add(localMachine) // was not in the excluded list
>       && isGoodDatanode(localDatanode, maxNodesPerRack, false,
>           results, avoidStaleNodes)) {
>     for (Iterator<Map.Entry<StorageType, Integer>> iter = storageTypes
>         .entrySet().iterator(); iter.hasNext(); ) {
>       Map.Entry<StorageType, Integer> entry = iter.next();
>       for (DatanodeStorageInfo localStorage : DFSUtil.shuffle(
>           localDatanode.getStorageInfos())) {
>         StorageType type = entry.getKey();
>         if (addIfIsGoodTarget(localStorage, excludedNodes, blocksize,
>             results, type) >= 0) {
>           int num = entry.getValue();
>           ...
> {code}
> (current logic above)
> *2.* Improve the logic and remove some duplicated code.
> For example, in {{chooseLocalStorage}} and {{chooseRandom}}, we add the node 
> to excludedNodes before the {{for}}, and we do it again if we find it's a 
> good target. {{numOfAvailableNodes -= newExcludedNodes}} is duplicated too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)