[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2017-12-04 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277759#comment-16277759
 ] 

genericqa commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HDFS-10967 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832548/HDFS-10967.03.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22279/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2017-07-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16104209#comment-16104209
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HDFS-10967 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832548/HDFS-10967.03.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/20450/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-13 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15573008#comment-15573008
 ] 

Konstantin Shvachko commented on HDFS-10967:


Sounds like HDFS-8131 already takes care of this issue.
# {{AvailableSpaceBlockPlacementPolicy}} chooses first replica locally
# It can be configured to prefer emptier nodes for other than the first replica 
(2nd, 3rd, etc.).
# If space usage on two nodes differs less than 5%, 
{{AvailableSpaceBlockPlacementPolicy}} treats them as equal size. 5% is 
hard-coded.
One generalization could be to factor the node size into the probability in the 
way that near-full nodes had very low chance of getting new replicas.
But this is probably a different issue, as no configuration change is needed.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-11 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566891#comment-15566891
 ] 

Zhe Zhang commented on HDFS-10967:
--

Thanks for confirming the 3.0 compatibility concern Andrew.

If we merge the logic into BPPDefault, it should be off by default. Using a 
hard threshold indeed might cause performance issues. So my current thought is 
to merge the HDFS-8131 logic into BPPDefault, which is much less aggressive.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-11 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1551#comment-1551
 ] 

Andrew Wang commented on HDFS-10967:


Thanks for the ping Zhe, I'm okay with changing the configuration of the 
available space policy in 3.0 during the alpha period.

I haven't looked at this patch, but have we looked at the potential performance 
impact, particularly if we combine with BPPDefault? Noting that on HDFS-8041, 
Kihwal resolved as WONTFIX because of possible performance issues, so there 
might be the same concern here.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-11 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565729#comment-15565729
 ] 

Ming Ma commented on HDFS-10967:


Thanks [~zhz] for the summary!

* Good idea to merge AvailableSpaceBlockPlacementPolicy to 
BlockPlacementPolicyDefault as configuration. That will make it easier for 
people to experiment the feature without changing the block placement policy. 
Generally what we want is the ability to reuse or compose new policies based on 
existing block placement policies, as I mentioned in 
https://issues.apache.org/jira/browse/HDFS-7613?focusedCommentId=14629070=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14629070.
* I agree with what [~shv] that the admin and ClientProtocol changes seem 
unnecessary. We don't define specific commands for many other things, for 
example, changing the block placement policy. NN failover in the HA setting 
should be good enough.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-11 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564741#comment-15564741
 ] 

Zhe Zhang commented on HDFS-10967:
--

Just realized HDFS-8131 achieves similar goal with a different approach. Will 
close this one after reviewing HDFS-8131 again.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563969#comment-15563969
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 43s{color} | {color:orange} hadoop-hdfs-project: The patch generated 14 new 
+ 1052 unchanged - 1 fixed = 1066 total (was 1053) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.tools.TestHdfsConfigFields |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832548/HDFS-10967.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 7bb128d54181 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 96b1266 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17091/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project.txt
 |
| unit | 

[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563882#comment-15563882
 ] 

Konstantin Shvachko commented on HDFS-10967:


This could be good optimization of placement policy for other than the first 
replicas when some nodes are close to full.
This should only be turned on though in heterogeneous environment. With 
homogeneous nodes where node usage is balanced this will be just a performance 
overhead.
Few suggestions on the patch
# Set the default for remaining capacity threshold that triggers new behavior 
to {{0.00}}. Homogeneous clusters should not require reconfiguration.
# The config variable is more like a "threshold" rather than a "factor" as in 
{{considerLoad}}. So may be call it 
{{dfs.namenode.replication.considerCapacity.threshold}}.
# I would suggest to use only one configuration variable. The Boolean 
{{considerCapacity}} is essentially redundant.
# Would be good to have a JavaDoc for the new config variable and for the 
method, where the feature is triggered.
# Did you try to add {{isNearFull()}} call into {{isGoodDatanode()}}? Then you 
will not need to retry 3 times.
# JavaDoc for the unit test would be also useful.

A separate discussion point whether we should add a separate admin command for 
every configuration nob we introduce? This doesn't feel right. May be we should 
have a command, say {{refreshConfiguration()}}, which re-reads config, updates 
variables and logs what it updated. The main problem with admin command is that 
it does not change configuration and when you restart the service or fail it 
over you loose the set value. So may be anything that is controlled by 
configuration should be updated through configuration and the 
{{refreshConfiguration()}} call. We could have merged this with 
{{setBalancerBandwidth()}} some time.
Alternatively, we can use failover to take effect of new configuration values. 
Then we don't need new admin commands at all.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563835#comment-15563835
 ] 

Zhe Zhang commented on HDFS-10967:
--

I verified reported test failures and could only reproduce 
{{TestHdfsConfigFields}}. Once we agree on the overall structure I'll do the 
due diligence:
# Add the item to {{hdfs-default.xml}}
# Document the new dfsAdmin command
# Clear up the checkStyle warnings

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch, 
> HDFS-10967.02.patch, HDFS-10967.03.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563810#comment-15563810
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | {color:orange} hadoop-hdfs-project: The patch generated 14 new 
+ 1052 unchanged - 1 fixed = 1066 total (was 1053) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 20s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 96m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
|   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
|   | hadoop.hdfs.server.datanode.TestLargeBlockReport |
|   | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832542/HDFS-10967.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux a91a71c163fa 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c874fa9 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 

[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563326#comment-15563326
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 10 new + 452 unchanged - 0 fixed = 462 total (was 452) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 27s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks 
|
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain 
|
|   | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832518/HDFS-10967.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0693f663ee00 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cef61d5 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17084/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17084/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17084/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17084/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add configuration for 

[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-10 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563120#comment-15563120
 ] 

Ming Ma commented on HDFS-10967:


Thanks [~zhz]. Indeed that is an issue when the cluster has heterogeneous 
nodes, and the rack-based assumption makes sense to me.

* For balancer or over replicated scenarios, {{chooseReplicasToDelete}} uses 
absolute free space, maybe that needs to be change to percentage based?
* What if we move this new policy to {{isGoodDatanode}}? It has several 
benefits:
** BlockPlacementPolicyRackFaultTolerant can uses it.
** Cover the case where the writer is outside of the cluster, thus the call 
path is chooseLocalRack -> chooseRandom.
* Typo below you meant this.considerCapacity.
{noformat}
this.considerLoad = conf.getBoolean(...);
{noformat}



> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.01.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556694#comment-15556694
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 4 new + 433 unchanged - 0 fixed = 437 total (was 433) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 59m 23s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 78m 39s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestSafeMode |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.server.blockmanagement.TestReplicationPolicyConsiderLoad |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832206/HDFS-10967.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 9656bba47972 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 
20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e57fa81 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17061/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17061/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17061/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17061/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org 

[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-07 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15556406#comment-15556406
 ] 

Zhe Zhang commented on HDFS-10967:
--

[~mingma] [~kihwal] [~daryn] I'd like to hear your opinions based on operating 
large clusters (not sure if you have the heterogeneity issue though). In our 
case, the Balancer is essentially fighting with application data ingestion. 
Triaging the 2nd and 3rd replicas at ingestion time should be better than 
spending the bandwidth to move them later.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
> Attachments: HDFS-10967.00.patch, HDFS-10967.poc.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553428#comment-15553428
 ] 

Hadoop QA commented on HDFS-10967:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 3 new + 433 unchanged - 0 fixed = 436 total (was 433) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 77m 12s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 96m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.tools.TestHdfsConfigFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-10967 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12832034/HDFS-10967.poc.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 39bd620beae5 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 1d330fb |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17050/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17050/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17050/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/17050/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add configuration for BlockPlacementPolicy to avoid 

[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

2016-10-06 Thread Zhe Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15552810#comment-15552810
 ] 

Zhe Zhang commented on HDFS-10967:
--

I'd like to discuss a few design perspectives. In general, I think this 
proposal is related to the existing {{considerLoad}} config knob, and we should 
leverage experience about how {{considerLoad}} has been used in production.
# Shall we add the logic only to 2nd and 3rd replicas? {{considerLoad}} applies 
to all replicas (including the first/local one). But I'm not sure if we should 
sacrifice locality if a DataNode is near-full.
# Which capacity metric should be used? Absolute amount of 
{{DatanodeInfo#remaining}} (indicating how many more replicas it can store) or 
the percentage of usage?
# Based on the selected metric, if a DataNode is near-full, should we simply 
avoid it, like how {{isGoodDatanode}} treats {{considerLoad}}? Or shall we just 
"toss the coin again" trying to find an emptier node?

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> ---
>
> Key: HDFS-10967
> URL: https://issues.apache.org/jira/browse/HDFS-10967
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
>  Labels: balancer
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So It'd be very 
> useful if we can lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org