[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305700#comment-16305700
 ] 

Hudson commented on HDFS-9023:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #13422 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13422/])
HDFS-9023. When NN is not able to identify DN for replication, reason (xiao: 
rev 5bf7e594d7d54e5295fe4240c3d60c08d4755ab7)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java


> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, 
> HDFS-9023.03.patch, HDFS-9023.branch-2.patch
>
>
> When NN is not able to identify DN for replication, reason behind it can be 
> logged (at least critical information why DNs not chosen like disk is full). 
> At present it is expected to enable debug log.
> For example the reason for below error looks like all 7 DNs are busy for data 
> writes. But at client or NN side no hint is given in the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305693#comment-16305693
 ] 

genericqa commented on HDFS-9023:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
18s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 41s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  1m 
25s{color} | {color:red} The patch generated 350 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}105m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:20 |
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsTokens |
|   | org.apache.hadoop.hdfs.TestDFSInotifyEventInputStream |
|   | org.apache.hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | org.apache.hadoop.hdfs.TestFileAppendRestart |
|   | org.apache.hadoop.hdfs.security.TestDelegationToken |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithRestCsrfPreventionFilter |
|   | org.apache.hadoop.hdfs.TestDFSMkdirs |
|   | org.apache.hadoop.hdfs.TestDatanodeReport |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr |
|   | org.apache.hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | org.apache.hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
|   | org.apache.hadoop.hdfs.TestSnapshotCommands |
|   | org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs |
|   | org.apache.hadoop.hdfs.TestDistributedFileSystem |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSForHA |
|   | org.apache.hadoop.hdfs.TestReplaceDatanodeFailureReplication |
|   | org.apache.hadoop.hdfs.TestDFSShell |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSAcl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:17213a0 |
| JIRA Issue | HDFS-9023 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903933/HDFS-9023.branch-2.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| 

[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305630#comment-16305630
 ] 

Xiao Chen commented on HDFS-9023:
-

Thanks for the review Surendra.

The findbugs / unit test failures are unrelated. This is a logging-only change, so no 
new unit test is added. Attaching a branch-2 patch because lambdas don't work with 
jdk7. Will commit after the pre-commit comes back.
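
For readers unfamiliar with the jdk7 constraint: branch-2 still builds with jdk7, so a 
{{ThreadLocal.withInitial(() -> ...)}} initializer has to be rewritten as an anonymous 
subclass. The sketch below only illustrates that rewrite; the field and type names follow 
the snippets quoted elsewhere in this thread and are not copied from the committed patch.
{code}
// jdk7-compatible replacement for ThreadLocal.withInitial(() -> new HashMap<>()):
// an anonymous ThreadLocal subclass that overrides initialValue().
private static final ThreadLocal<HashMap<NodeNotChosenReason, Integer>>
    chooseRandomReasons =
        new ThreadLocal<HashMap<NodeNotChosenReason, Integer>>() {
          @Override
          protected HashMap<NodeNotChosenReason, Integer> initialValue() {
            return new HashMap<NodeNotChosenReason, Integer>();
          }
        };
{code}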


> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, 
> HDFS-9023.03.patch, HDFS-9023.branch-2.patch
>
>
> When NN is not able to identify DN for replication, reason behind it can be 
> logged (at least critical information why DNs not chosen like disk is full). 
> At present it is expected to enable debug log.
> For example the reason for below error looks like all 7 DNs are busy for data 
> writes. But at client or NN side no hint is given in the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-28 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305279#comment-16305279
 ] 

genericqa commented on HDFS-9023:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 29s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in trunk has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}112m 22s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}163m  4s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-9023 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903826/HDFS-9023.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux cc1f2cc72332 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d31c9d8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22515/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22515/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22515/testReport/ |
| Max. process+thread count | 3375 (vs. ulimit of 5000) |
| modules | C: 

[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-27 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305116#comment-16305116
 ] 

genericqa commented on HDFS-9023:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
24s{color} | {color:red} root in trunk failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-hdfs in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-hdfs in trunk failed. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
1m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-hdfs in trunk failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
24s{color} | {color:red} hadoop-hdfs in trunk failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 55s{color} 
| {color:red} hadoop-hdfs-project_hadoop-hdfs generated 394 new + 0 unchanged - 
0 fixed = 394 total (was 0) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 38s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 68 new + 0 unchanged - 0 fixed = 68 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
51s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}120m  8s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | 
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-9023 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12903826/HDFS-9023.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a1cfa4d4dcae 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git 

[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-27 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305054#comment-16305054
 ] 

Surendra Singh Lilhore commented on HDFS-9023:
--

Thanks [~xiaochen] for the V3 patch.

+1, LGTM.

bq. So kept this as-is in patch 3. Please let me know if you feel otherwise.

Fine, no need to change.



> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch, 
> HDFS-9023.03.patch
>
>
> When NN is not able to identify DN for replication, reason behind it can be 
> logged (at least critical information why DNs not chosen like disk is full). 
> At present it is expected to enable debug log.
> For example the reason for below error looks like all 7 DNs are busy for data 
> writes. But at client or NN side no hint is given in the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-24 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302920#comment-16302920
 ] 

Surendra Singh Lilhore commented on HDFS-9023:
--

Thanks [~xiaochen] for the patch. The V2 patch almost looks good to me.
Some minor comments:
# No need to put the {{if (LOG.isDebugEnabled() && builder != null)}} check in the else 
part. The {{LOG.isDebugEnabled()}} check is already done in the first {{if}}, so it should 
be like this.
{code}
  if (LOG.isDebugEnabled() && builder != null) {
detail = builder.toString();
if (badTarget) {
  builder.setLength(0);
} else {
  if (detail.length() > 1) {
// only log if there's more than "[", which is always appended at
// the beginning of this method.
LOG.debug(builder.toString());
  }
  detail = "";
}
  }
{code}
# Just give the HashMap its generic parameters here.
{code}+  HashMap reasonMap = CHOOSE_RANDOM_REASONS.get();{code}
# Log message should be {{warn}}
{code}+LOG.info("Not enough replicas was chosen. Reason:{}", 
reasonMap);{code}

> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
> Attachments: HDFS-9023.01.patch, HDFS-9023.02.patch
>
>
> When NN is not able to identify DN for replication, reason behind it can be 
> logged (at least critical information why DNs not chosen like disk is full). 
> At present it is expected to enable debug log.
> For example the reason for below error looks like all 7 DNs are busy for data 
> writes. But at client or NN side no hint is given in the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-20 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299006#comment-16299006
 ] 

Xiao Chen commented on HDFS-9023:
-

Thanks a lot [~surendrasingh] for the review and reassignment.

I resolved HDFS-12726 as a dup since this jira was created earlier. Copying your 
comments over to continue here.

{quote}
# {{private static final ThreadLocal<HashMap> chooseRandomReasons}}
Here HashMap is a raw type. It should be parameterized, e.g.:
{code}
  private static final ThreadLocal<HashMap<NodeNotChosenReason, Integer>>
      chooseRandomReasons = ThreadLocal
          .withInitial(() -> new HashMap<>());
{code}
# No need to log the failure reason here.
{code}
  Map<NodeNotChosenReason, Integer> reasonMap = chooseRandomReasons.get();
  if (!reasonMap.isEmpty()) {
    LOG.info("Not enough replicas was chosen. Reason:{}", reasonMap);
  }
  throw new NotEnoughReplicasException(detail);
{code}
Just append the reason to the {{detail}} string. The {{NotEnoughReplicasException}} message 
will be logged in the {{chooseTarget(..)}} method.
{code}
  if (LOG.isTraceEnabled()) {
    LOG.trace(message, e);
  } else {
    LOG.warn(message + " " + e.getMessage());
  }
{code}
# I feel {{NodeNotChosenReason#NO_GOOD_STORAGE}} should be 
{{NodeNotChosenReason#NOT_ENOUGH_STORAGE_SPACE}}. We use this when 
{{(requiredSize > remaining - scheduledSize)}}.
{quote}
bq. 1.
Good catch, fixed.
bq. 2. append reason in detail string.
I thought about this too, but did not change the behavior because it seems 
{{badTarget}}'s filtering logic is very explicit.
I git blamed to [the earliest available version on 
github|https://github.com/apache/hadoop/blob/a196766ea07775f18ded69bd9e8d239f8cfd3ccc/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/BlockPlacementPolicyDefault.java#L413],
 but the exception message filtering is still there. My guess would be we don't 
want a benign (or non-critical) exception to have a message that's too long. 
I'd like to keep the day-0 behavior to prevent surprises.
bq. 3. NOT_ENOUGH_STORAGE_SPACE
Done. Added the storage type to the message to make it clear that the space is 
specific to the type.

Patch 2 also addressed 2 other issues I found: 
- {{DatanodeDescriptor}}'s log added by HDFS-11494 should use the BPP logger. 
This is the original behavior before HDFS-8946. (This can't be changed to use 
{{BPPD#logNodeIsNotChosen}} because it may also be called by 
{{chooseLocalStorage}}, which doesn't go through the logging in {{chooseRandom}}.)
- Only log when the StringBuilder has more than "[".

Some of the tests I ran to look at the logs are: 
{{TestErasureCodingMultipleRacks}}, {{TestDefaultBlockPlacementPolicy}}, 
{{TestNamenodeStorageDirectives}}.
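
To summarize the approach being discussed for readers skimming this thread: rejection 
reasons are accumulated per placement call in a thread-local map and surfaced when no 
target can be chosen. The following is only a hedged sketch built from the snippets 
quoted in this thread; the {{recordNotChosenReason}} helper name is illustrative and 
not taken from the committed patch.
{code}
// Illustrative helper: bump the counter for the reason a node was rejected.
private static void recordNotChosenReason(NodeNotChosenReason reason) {
  HashMap<NodeNotChosenReason, Integer> reasonMap = chooseRandomReasons.get();
  Integer count = reasonMap.get(reason);
  reasonMap.put(reason, count == null ? 1 : count + 1);
}

// Separate fragment: when chooseRandom(...) runs out of candidates, the
// accumulated counts are surfaced once and cleared before throwing.
HashMap<NodeNotChosenReason, Integer> reasonMap = chooseRandomReasons.get();
if (!reasonMap.isEmpty()) {
  LOG.warn("Not enough replicas was chosen. Reason: {}", reasonMap);
  reasonMap.clear();
}
throw new NotEnoughReplicasException(detail);
{code}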

> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Xiao Chen
>Priority: Critical
>
> When NN is not able to identify DN for replication, reason behind it can be 
> logged (at least critical information why DNs not chosen like disk is full). 
> At present it is expected to enable debug log.
> For example the reason for below error looks like all 7 DNs are busy for data 
> writes. But at client or NN side no hint is given in the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-20 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16298205#comment-16298205
 ] 

Surendra Singh Lilhore commented on HDFS-9023:
--

Thanks [~xiaochen] for looking into this issue.

bq. Would you have time to take a look? 
Yeah, I will post my review comments in HDFS-12726.

bq.  I can also post here and close HDFS-12726 as a dup if you don't mind.
Yes, you can attach your patch to this jira. I will assign this to you.

> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Critical
>
> When NN is not able to identify DN for replication, reason behind it can be 
> logged (at least critical information why DNs not chosen like disk is full). 
> At present it is expected to enable debug log.
> For example the reason for below error looks like all 7 DNs are busy for data 
> writes. But at client or NN side no hint is given in the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2017-12-19 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16297535#comment-16297535
 ] 

Xiao Chen commented on HDFS-9023:
-

Hi [~surendrasingh],

Thanks for creating the jira and expressing your thoughts.

I ran into the same (lack of) logging issue while debugging some EC stuff, and 
did a patch locally. Git blaming to HDFS-8946, I saw your comment there and 
followed the trace to this jira. I like your idea about extra info, and 
implemented it in HDFS-12726. Would you have time to take a look? I can also 
post here and close HDFS-12726 as a dup if you don't mind. Sorry I didn't find 
this jira earlier.

> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Critical
>
> When NN is not able to identify DN for replication, reason behind it can be 
> logged (at least critical information why DNs not chosen like disk is full). 
> At present it is expected to enable debug log.
> For example the reason for below error looks like all 7 DNs are busy for data 
> writes. But at client or NN side no hint is given in the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9023) When NN is not able to identify DN for replication, reason behind it can be logged

2015-09-04 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14730719#comment-14730719
 ] 

Surendra Singh Lilhore commented on HDFS-9023:
--

In the log we can give some extra info to the client, like {{READ_ONLY=10, NO_SPACE=5, 
FAILED=4}} or {{All required storage types are unavailable.}}
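
Purely as an illustration of the suggested format (the reason names here are hypothetical 
placeholders, not actual constants), building such a summary could look like:
{code}
// Hypothetical sketch: count why each datanode was rejected, then print a
// compact "REASON=count" summary such as READ_ONLY=10, NO_SPACE=5, FAILED=4.
Map<String, Integer> rejectionCounts = new LinkedHashMap<>();
rejectionCounts.put("READ_ONLY", 10);
rejectionCounts.put("NO_SPACE", 5);
rejectionCounts.put("FAILED", 4);

StringBuilder summary = new StringBuilder();
for (Map.Entry<String, Integer> e : rejectionCounts.entrySet()) {
  if (summary.length() > 0) {
    summary.append(", ");
  }
  summary.append(e.getKey()).append('=').append(e.getValue());
}
// summary.toString() -> "READ_ONLY=10, NO_SPACE=5, FAILED=4"
{code}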

> When NN is not able to identify DN for replication, reason behind it can be 
> logged
> --
>
> Key: HDFS-9023
> URL: https://issues.apache.org/jira/browse/HDFS-9023
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, namenode
>Affects Versions: 2.7.1
>Reporter: Surendra Singh Lilhore
>Assignee: Surendra Singh Lilhore
>Priority: Critical
>
> When NN is not able to identify DN for replication, reason behind it can be 
> logged (at least critical information why DNs not chosen like disk is full). 
> At present it is expected to enable debug log.
> For example the reason for below error looks like all 7 DNs are busy for data 
> writes. But at client or NN side no hint is given in the log message.
> {noformat}
> File /tmp/logs/spark/logs/application_1437051383180_0610/xyz-195_26009.tmp 
> could only be replicated to 0 nodes instead of minReplication (=1).  There 
> are 7 datanode(s) running and no node(s) are excluded in this operation.
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1553)
>  
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)