Auto-Re: [jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-29 Thread wsb
Your email has been received! Thank you!

[jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-28 Thread Brahma Reddy Battula (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14605080#comment-14605080 ]

Brahma Reddy Battula commented on HDFS-8586:


[~vinayrpet] Kindly review the attached patch. Thanks!

 Dead Datanode is allocated for write when client is  from deadnode
 --

 Key: HDFS-8586
 URL: https://issues.apache.org/jira/browse/HDFS-8586
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
Priority: Critical
 Attachments: HDFS-8586.patch


  *{color:blue}DataNode marked as Dead{color}* 
 2015-06-11 19:39:00,862 | INFO  | 
 org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@28ec166e
  | BLOCK*  *removeDeadDatanode: lost heartbeat from XX.XX.39.33:25009*  | 
 org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.removeDeadDatanode(DatanodeManager.java:584)
 2015-06-11 19:39:00,863 | INFO  | 
 org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager$Monitor@28ec166e
  | Removing a node: /default/rack3/XX.XX.39.33:25009 | 
 org.apache.hadoop.net.NetworkTopology.remove(NetworkTopology.java:488)
   *{color:blue}Deadnode got Allocated{color}* 
 2015-06-11 19:39:45,148 | WARN  | IPC Server handler 26 on 25000 | The 
 cluster does not contain node: /default/rack3/XX.XX.39.33:25009 | 
 org.apache.hadoop.net.NetworkTopology.getDistance(NetworkTopology.java:616)
 2015-06-11 19:39:45,149 | WARN  | IPC Server handler 26 on 25000 | The 
 cluster does not contain node: /default/rack3/XX.XX.39.33:25009 | 
 org.apache.hadoop.net.NetworkTopology.getDistance(NetworkTopology.java:616)
 2015-06-11 19:39:45,149 | WARN  | IPC Server handler 26 on 25000 | The 
 cluster does not contain node: /default/rack3/XX.XX.39.33:25009 | 
 org.apache.hadoop.net.NetworkTopology.getDistance(NetworkTopology.java:616)
 2015-06-11 19:39:45,149 | WARN  | IPC Server handler 26 on 25000 | The 
 cluster does not contain node: /default/rack3/XX.XX.39.33:25009 | 
 org.apache.hadoop.net.NetworkTopology.getDistance(NetworkTopology.java:616)
 2015-06-11 19:39:45,149 | INFO  | IPC Server handler 26 on 25000 | BLOCK*  
 *allocate blk_1073754030_13252* {UCState=UNDER_CONSTRUCTION, 
 truncateBlock=null, primaryNodeIndex=-1, 
 replicas=[ReplicaUC[[DISK]DS-e8d29773-dfc2-4224-b1d6-9b0588bca55e:NORMAL:{color:red}XX.XX.39.33:25009{color}|RBW],
   
 ReplicaUC[[DISK]DS-f7d2ab3c-88f7-470c-9097-84387c0bec83:NORMAL:XX.XX.38.32:25009|RBW],
  ReplicaUC[[DISK]DS-8c2a464a-ac81-4651-890a-dbfd07ddd95f:NORMAL: 
 *XX.XX.38.33:25009|RBW]]* } for /t1._COPYING_ | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveAllocatedBlock(FSNamesystem.java:3657)
 2015-06-11 19:39:45,191 | INFO  | IPC Server handler 35 on 25000 | BLOCK* 
 allocate blk_1073754031_13253{UCState=UNDER_CONSTRUCTION, truncateBlock=null, 
 primaryNodeIndex=-1, 
 replicas=[ReplicaUC[[DISK]DS-ed8ad579-50c0-4e3e-8780-9776531763b6:NORMAL:XX.XX.39.31:25009|RBW],
  
 ReplicaUC[[DISK]DS-19ddd6da-4a3e-481a-8445-dde5c90aaff3:NORMAL:XX.XX.37.32:25009|RBW],
  ReplicaUC[[DISK]DS-4ce4ce39-4973-42ce-8c7d-cb41f899db85: 
 {{NORMAL:XX.XX.37.33:25009}}   |RBW]]} for /t1._COPYING_ | 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.saveAllocatedBlock(FSNamesystem.java:3657)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Auto-Re: [jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-28 Thread wsb
Your email has been received! Thank you!

[jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-23 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597398#comment-14597398 ]

Hadoop QA commented on HDFS-8586:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  18m  3s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 39s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 48s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 18s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 17s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 17s | Pre-build of native portion |
| {color:green}+1{color} | hdfs tests | 158m 22s | Tests passed in hadoop-hdfs. |
| | | 205m 15s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12741218/HDFS-8586.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 41ae776 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11444/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11444/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11444/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11444/console |


This message was automatically generated.

[jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-22 Thread Vinayakumar B (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597156#comment-14597156 ]

Vinayakumar B commented on HDFS-8586:
-

Thanks [~brahmareddy] for reporting this.
This happens when the NameNode still has the node in its list of dead nodes
and a block allocation request comes from the same machine as the dead node:
the dead node is then chosen as the local node, irrespective of whether it is
still part of the cluster. Adding one check in
{{BlockPlacementPolicyDefault.java#chooseLocalStorage(..)}} will fix this.

Regarding the test proposed above, it will not always fail: it is a
MiniDFSCluster test, so all datanodes run on the same machine, and the dead
node is not guaranteed to be chosen as the local storage.
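
The check described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the attached patch: {{LocalNodeChooser}}, {{chooseLocal}}, and the plain set of node names standing in for the NameNode's cluster topology are hypothetical names introduced for illustration.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

// Hypothetical model: the set stands in for the NameNode's cluster topology.
class LocalNodeChooser {
  private final Set<String> topology = new HashSet<>();

  void addNode(String node) {
    topology.add(node);
  }

  // Mirrors the effect of removing a dead datanode from the topology.
  void removeNode(String node) {
    topology.remove(node);
  }

  // Returns the writer's own node only if it is still part of the topology.
  // Without the membership check, a node already removed as dead would be
  // returned simply because the write request came from that machine.
  Optional<String> chooseLocal(String writerNode) {
    if (writerNode != null && topology.contains(writerNode)) {
      return Optional.of(writerNode);
    }
    return Optional.empty(); // the caller would fall back to another node
  }

  public static void main(String[] args) {
    LocalNodeChooser chooser = new LocalNodeChooser();
    chooser.addNode("/default/rack3/dn1");
    chooser.addNode("/default/rack3/dn2");
    chooser.removeNode("/default/rack3/dn1"); // dn1 lost its heartbeat
    System.out.println(chooser.chooseLocal("/default/rack3/dn1").isPresent());
    System.out.println(chooser.chooseLocal("/default/rack3/dn2").isPresent());
  }
}
```

In this model the writer on the removed node gets no local target, which is the behaviour the proposed check is meant to achieve.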

[jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-22 Thread Brahma Reddy Battula (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597196#comment-14597196 ]

Brahma Reddy Battula commented on HDFS-8586:


[~vinayrpet] Thanks a lot for taking a look at this issue. I added the check
in {{BlockPlacementPolicyDefault.java#chooseLocalStorage(..)}} and corrected
the test case. Kindly review.

[jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-15 Thread Brahma Reddy Battula (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586118#comment-14586118 ]

Brahma Reddy Battula commented on HDFS-8586:


A workaround is to enable *dfs.namenode.avoid.write.stale.datanode*. A node is
considered stale if there has been no heartbeat for 30 seconds (by default),
so it is not allocated for writes. Does anyone have other thoughts?
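
To see why a stale check would also cover dead nodes, it helps to compare the two timeouts. The sketch below assumes the usual defaults (a 3 s heartbeat, a 5 min recheck interval, a 30 s stale interval) and the dead-node timeout of 2 x recheck interval + 10 x heartbeat interval; {{HeartbeatTimeouts}} and its methods are hypothetical helpers for illustration, not NameNode code.

```java
// Hypothetical helper for illustration; not part of Hadoop.
class HeartbeatTimeouts {
  // Dead timeout in milliseconds: 2 * recheck interval + 10 * heartbeat interval.
  static long deadIntervalMs(long recheckMs, long heartbeatSec) {
    return 2 * recheckMs + 10 * 1000 * heartbeatSec;
  }

  // A node is stale once it has been silent for longer than the stale interval.
  static boolean isStale(long sinceLastHeartbeatMs, long staleIntervalMs) {
    return sinceLastHeartbeatMs > staleIntervalMs;
  }

  // A node is dead once it has been silent for longer than the dead interval.
  static boolean isDead(long sinceLastHeartbeatMs, long deadIntervalMs) {
    return sinceLastHeartbeatMs > deadIntervalMs;
  }

  public static void main(String[] args) {
    long staleMs = 30_000L;                     // assumed default stale interval
    long deadMs = deadIntervalMs(300_000L, 3L); // assumed default recheck and heartbeat
    long silentMs = deadMs + 1;                 // a node that has just become dead
    System.out.println("dead interval ms = " + deadMs);
    System.out.println("stale when dead = " + isStale(silentMs, staleMs));
  }
}
```

Because the stale interval is much shorter than the dead interval under these assumptions, any node silent long enough to be dead is also stale, so skipping stale nodes for writes skips dead ones as well.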

[jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-13 Thread Brahma Reddy Battula (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584465#comment-14584465 ]

Brahma Reddy Battula commented on HDFS-8586:


 *Test code to reproduce this bug* 
{code}
public void testDeadDatanodeForBlockLocation() throws Exception {
  Configuration conf = new HdfsConfiguration();
  // Shorten the heartbeat settings so a stopped datanode is marked dead quickly
  conf.setInt(DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY, 500);
  conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1L);
  cluster = new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
  cluster.waitActive();

  String poolId = cluster.getNamesystem().getBlockPoolId();
  // wait for datanode to be marked live
  DataNode dn = cluster.getDataNodes().get(0);
  DatanodeRegistration reg =
      DataNodeTestUtils.getDNRegistrationForBP(dn, poolId);

  DFSTestUtil.waitForDatanodeState(cluster, reg.getDatanodeUuid(), true, 2);

  // Shutdown and wait for data node to be marked dead
  dn.shutdown();
  DFSTestUtil.waitForDatanodeState(cluster, reg.getDatanodeUuid(), false, 2);
  System.out.println("Dn downXXX: " + dn.getDisplayName());

  // Write a file after the datanode has been marked dead
  Path file = new Path("afile");
  try (FSDataOutputStream outputStream = cluster.getFileSystem().create(file)) {
    outputStream.writeChars("testContent");
  }

  // None of the block's locations should be the dead datanode
  BlockLocation block =
      cluster.getFileSystem().getFileBlockLocations(file, 0, 10)[0];
  System.out.println("Dn down: " + dn.getDisplayName());
  for (String node : block.getNames()) {
    System.out.println(node);
    if (node.equals(dn.getDisplayName())) {
      fail("Not expecting the block in a dead node");
    }
  }
}
{code}
 *Impact I have seen* 
{color:red}The cluster has 9 DataNodes, and 5 of them are stopped.
dfs.replication=3. Files are put to HDFS continuously, but some operations
fail.{color} 


I think we can also consider dead nodes here:

{code}
if (isGoodTarget(storage, blockSize, maxNodesPerRack, considerLoad,
    results, avoidStaleNodes, storageType)) {
  results.add(storage);
  // add node and related nodes to excludedNode
  return addToExcludedNodes(storage.getDatanodeDescriptor(), excludedNodes);
}
{code}
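
One way to read this suggestion is to add a liveness condition next to the existing stale-node condition. The following is a simplified model of that idea, not the Hadoop implementation: {{TargetFilter}}, its {{Node}} class, and this {{isGoodTarget}} signature are hypothetical stand-ins introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of target selection; not the Hadoop classes.
class TargetFilter {
  // Stand-in for a datanode descriptor with liveness and staleness flags.
  static final class Node {
    final String name;
    final boolean alive;
    final boolean stale;

    Node(String name, boolean alive, boolean stale) {
      this.name = name;
      this.alive = alive;
      this.stale = stale;
    }
  }

  static boolean isGoodTarget(Node node, boolean avoidStaleNodes) {
    if (!node.alive) {
      return false; // the additional check: never write to a dead node
    }
    if (avoidStaleNodes && node.stale) {
      return false; // stale nodes are skipped only when configured
    }
    return true;
  }

  // Picks up to `replicas` acceptable targets in candidate order.
  static List<String> chooseTargets(List<Node> candidates,
      boolean avoidStaleNodes, int replicas) {
    List<String> results = new ArrayList<>();
    for (Node node : candidates) {
      if (results.size() == replicas) {
        break;
      }
      if (isGoodTarget(node, avoidStaleNodes)) {
        results.add(node.name);
      }
    }
    return results;
  }

  public static void main(String[] args) {
    List<Node> nodes = new ArrayList<>();
    nodes.add(new Node("dn1", false, true)); // dead
    nodes.add(new Node("dn2", true, true));  // alive but stale
    nodes.add(new Node("dn3", true, false)); // healthy
    System.out.println(chooseTargets(nodes, false, 3));
  }
}
```

With the liveness check in place, the dead node is rejected even when stale-node avoidance is switched off.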

[jira] [Commented] (HDFS-8586) Dead Datanode is allocated for write when client is from deadnode

2015-06-13 Thread Brahma Reddy Battula (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584502#comment-14584502 ]

Brahma Reddy Battula commented on HDFS-8586:


Currently only stale nodes are excluded, but here we can also consider dead
nodes.
