[jira] [Commented] (HDFS-4261) TestBalancerWithNodeGroup times out

2013-06-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672844#comment-13672844
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-4261:
--

Sure, let's fix the failure in HDFS-4376.  Thanks for the update.

 TestBalancerWithNodeGroup times out
 ---

 Key: HDFS-4261
 URL: https://issues.apache.org/jira/browse/HDFS-4261
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 1.0.4, 1.1.1, 2.0.2-alpha
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Junping Du
 Fix For: 3.0.0

 Attachments: HDFS-4261-branch-1.patch, HDFS-4261-branch-1-v2.patch, 
 HDFS-4261-branch-2.patch, HDFS-4261.patch, HDFS-4261-v2.patch, 
 HDFS-4261-v3.patch, HDFS-4261-v4.patch, HDFS-4261-v5.patch, 
 HDFS-4261-v6.patch, HDFS-4261-v7.patch, HDFS-4261-v8.patch, jstack-mac-18567, 
 jstack-win-5488, 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup-output.txt.mac,
  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup-output.txt.win,
  test-balancer-with-node-group-timeout.txt


 When I manually ran TestBalancerWithNodeGroup, it always timed out on my 
 machine.  Looking at the Jenkins report [build 
 #3573|https://builds.apache.org/job/PreCommit-HDFS-Build/3573//testReport/org.apache.hadoop.hdfs.server.balancer/],
  TestBalancerWithNodeGroup was somehow skipped, so the problem was not 
 detected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4382) Fix typo MAX_NOT_CHANGED_INTERATIONS

2013-06-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-4382:
-

Affects Version/s: (was: 3.0.0)
Fix Version/s: (was: 3.0.0)
   2.1.0-beta

Merged this to branch-2.

 Fix typo MAX_NOT_CHANGED_INTERATIONS
 

 Key: HDFS-4382
 URL: https://issues.apache.org/jira/browse/HDFS-4382
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 2.1.0-beta

 Attachments: hdfs-4382-v1.txt


 Here is an example:
 {code}
 +  if (notChangedIterations >= MAX_NOT_CHANGED_INTERATIONS) {
 {code}
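
 For context, a minimal sketch of what the rename presumably amounts to (the 
 constant's value and surrounding loop are assumptions, not taken verbatim from 
 Balancer.java):
 {code}
 // Sketch: rename the misspelled constant; the comparison site updates to match.
 public class BalancerLoopSketch {
   // previously MAX_NOT_CHANGED_INTERATIONS
   private static final int MAX_NOT_CHANGED_ITERATIONS = 5;

   // exit the balancing loop once no block has moved for too many iterations
   static boolean shouldExit(int notChangedIterations) {
     return notChangedIterations >= MAX_NOT_CHANGED_ITERATIONS;
   }
 }
 {code}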

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4261) TestBalancerWithNodeGroup times out

2013-06-03 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-4261:
-

   Resolution: Fixed
Fix Version/s: (was: 3.0.0)
   1.3.0
   2.1.0-beta
   1-win
   Status: Resolved  (was: Patch Available)

Merged this to branch-2 and also committed the branch-1 patch.  Thanks, Junping!

 TestBalancerWithNodeGroup times out
 ---

 Key: HDFS-4261
 URL: https://issues.apache.org/jira/browse/HDFS-4261
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 1.0.4, 1.1.1, 2.0.2-alpha
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Junping Du
 Fix For: 1-win, 2.1.0-beta, 1.3.0

 Attachments: HDFS-4261-branch-1.patch, HDFS-4261-branch-1-v2.patch, 
 HDFS-4261-branch-2.patch, HDFS-4261.patch, HDFS-4261-v2.patch, 
 HDFS-4261-v3.patch, HDFS-4261-v4.patch, HDFS-4261-v5.patch, 
 HDFS-4261-v6.patch, HDFS-4261-v7.patch, HDFS-4261-v8.patch, jstack-mac-18567, 
 jstack-win-5488, 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup-output.txt.mac,
  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup-output.txt.win,
  test-balancer-with-node-group-timeout.txt


 When I manually ran TestBalancerWithNodeGroup, it always timed out on my 
 machine.  Looking at the Jenkins report [build 
 #3573|https://builds.apache.org/job/PreCommit-HDFS-Build/3573//testReport/org.apache.hadoop.hdfs.server.balancer/],
  TestBalancerWithNodeGroup was somehow skipped, so the problem was not 
 detected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4261) TestBalancerWithNodeGroup times out

2013-06-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672869#comment-13672869
 ] 

Junping Du commented on HDFS-4261:
--

Thanks Nicholas!

 TestBalancerWithNodeGroup times out
 ---

 Key: HDFS-4261
 URL: https://issues.apache.org/jira/browse/HDFS-4261
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 1.0.4, 1.1.1, 2.0.2-alpha
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Junping Du
 Fix For: 1-win, 2.1.0-beta, 1.3.0

 Attachments: HDFS-4261-branch-1.patch, HDFS-4261-branch-1-v2.patch, 
 HDFS-4261-branch-2.patch, HDFS-4261.patch, HDFS-4261-v2.patch, 
 HDFS-4261-v3.patch, HDFS-4261-v4.patch, HDFS-4261-v5.patch, 
 HDFS-4261-v6.patch, HDFS-4261-v7.patch, HDFS-4261-v8.patch, jstack-mac-18567, 
 jstack-win-5488, 
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup-output.txt.mac,
  
 org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup-output.txt.win,
  test-balancer-with-node-group-timeout.txt


 When I manually ran TestBalancerWithNodeGroup, it always timed out on my 
 machine.  Looking at the Jenkins report [build 
 #3573|https://builds.apache.org/job/PreCommit-HDFS-Build/3573//testReport/org.apache.hadoop.hdfs.server.balancer/],
  TestBalancerWithNodeGroup was somehow skipped, so the problem was not 
 detected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4860) Add additional attributes to JMX beans

2013-06-03 Thread Trevor Lorimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Lorimer updated HDFS-4860:
-

Status: Open  (was: Patch Available)

 Add additional attributes to JMX beans
 --

 Key: HDFS-4860
 URL: https://issues.apache.org/jira/browse/HDFS-4860
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.4-alpha, 0.20.204.1, 3.0.0, 2.1.0-beta
Reporter: Trevor Lorimer
 Attachments: 0001-HDFS-4860.patch


 Currently the JMX bean returns much of the data contained on the HDFS Health 
 webpage (dfsHealth.html). However, several other attributes need to be added.
 I intend to add the following items to the appropriate bean (named in parentheses):
 Started time (NameNodeInfo),
 Compiled info (NameNodeInfo),
 Jvm MaxHeap, MaxNonHeap (JvmMetrics),
 Node Usage stats (i.e. Min, Median, Max, stdev) (NameNodeInfo),
 Count of decommissioned Live and Dead nodes (FSNamesystemState),
 Journal Status (NameNodeInfo)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4860) Add additional attributes to JMX beans

2013-06-03 Thread Trevor Lorimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Lorimer updated HDFS-4860:
-

Attachment: (was: 0001-HDFS-4860.patch)

 Add additional attributes to JMX beans
 --

 Key: HDFS-4860
 URL: https://issues.apache.org/jira/browse/HDFS-4860
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 0.20.204.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha
Reporter: Trevor Lorimer
 Attachments: 0002-HDFS-4860.patch


 Currently the JMX bean returns much of the data contained on the HDFS Health 
 webpage (dfsHealth.html). However, several other attributes need to be added.
 I intend to add the following items to the appropriate bean (named in parentheses):
 Started time (NameNodeInfo),
 Compiled info (NameNodeInfo),
 Jvm MaxHeap, MaxNonHeap (JvmMetrics),
 Node Usage stats (i.e. Min, Median, Max, stdev) (NameNodeInfo),
 Count of decommissioned Live and Dead nodes (FSNamesystemState),
 Journal Status (NameNodeInfo)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4860) Add additional attributes to JMX beans

2013-06-03 Thread Trevor Lorimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Lorimer updated HDFS-4860:
-

Attachment: 0002-HDFS-4860.patch

Added messages to the asserts. The test that was breaking seems to be unrelated 
to my changes.

 Add additional attributes to JMX beans
 --

 Key: HDFS-4860
 URL: https://issues.apache.org/jira/browse/HDFS-4860
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 0.20.204.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha
Reporter: Trevor Lorimer
 Attachments: 0002-HDFS-4860.patch


 Currently the JMX bean returns much of the data contained on the HDFS Health 
 webpage (dfsHealth.html). However, several other attributes need to be added.
 I intend to add the following items to the appropriate bean (named in parentheses):
 Started time (NameNodeInfo),
 Compiled info (NameNodeInfo),
 Jvm MaxHeap, MaxNonHeap (JvmMetrics),
 Node Usage stats (i.e. Min, Median, Max, stdev) (NameNodeInfo),
 Count of decommissioned Live and Dead nodes (FSNamesystemState),
 Journal Status (NameNodeInfo)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4860) Add additional attributes to JMX beans

2013-06-03 Thread Trevor Lorimer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Lorimer updated HDFS-4860:
-

Status: Patch Available  (was: Open)

 Add additional attributes to JMX beans
 --

 Key: HDFS-4860
 URL: https://issues.apache.org/jira/browse/HDFS-4860
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.4-alpha, 0.20.204.1, 3.0.0, 2.1.0-beta
Reporter: Trevor Lorimer
 Attachments: 0002-HDFS-4860.patch


 Currently the JMX bean returns much of the data contained on the HDFS Health 
 webpage (dfsHealth.html). However, several other attributes need to be added.
 I intend to add the following items to the appropriate bean (named in parentheses):
 Started time (NameNodeInfo),
 Compiled info (NameNodeInfo),
 Jvm MaxHeap, MaxNonHeap (JvmMetrics),
 Node Usage stats (i.e. Min, Median, Max, stdev) (NameNodeInfo),
 Count of decommissioned Live and Dead nodes (FSNamesystemState),
 Journal Status (NameNodeInfo)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4860) Add additional attributes to JMX beans

2013-06-03 Thread Trevor Lorimer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673070#comment-13673070
 ] 

Trevor Lorimer commented on HDFS-4860:
--

An example display of the new attributes in NameNodeInfo:

NodeUsage: 
{"nodeUsage":{"min":"1.02%","median":"1.02%","max":"1.02%","stdDev":"0.00%"}}

NameJournalStatus: 
[{"stream":"EditLogFileOutputStream","Required":"false","manager":"FileJournalManage","streamLocation":"(/opt/hadoop/hdfs/namenode/current/edits_inprogress_364)","Disabled":"false","OpenForWrite":"true","managerLocation":"(root=/opt/hadoop/hdfs/namenode)"}]

NNStarted: Fri May 31 15:29:25 BST 2013

CompileInfo: 2013-05-31T14:14Z by trevorlorimer from hadoop-2.0.4-wdd3.6

 Add additional attributes to JMX beans
 --

 Key: HDFS-4860
 URL: https://issues.apache.org/jira/browse/HDFS-4860
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 0.20.204.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha
Reporter: Trevor Lorimer
 Attachments: 0002-HDFS-4860.patch


 Currently the JMX bean returns much of the data contained on the HDFS Health 
 webpage (dfsHealth.html). However, several other attributes need to be added.
 I intend to add the following items to the appropriate bean (named in parentheses):
 Started time (NameNodeInfo),
 Compiled info (NameNodeInfo),
 Jvm MaxHeap, MaxNonHeap (JvmMetrics),
 Node Usage stats (i.e. Min, Median, Max, stdev) (NameNodeInfo),
 Count of decommissioned Live and Dead nodes (FSNamesystemState),
 Journal Status (NameNodeInfo)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4860) Add additional attributes to JMX beans

2013-06-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673171#comment-13673171
 ] 

Hadoop QA commented on HDFS-4860:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12585840/0002-HDFS-4860.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4468//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4468//console

This message is automatically generated.

 Add additional attributes to JMX beans
 --

 Key: HDFS-4860
 URL: https://issues.apache.org/jira/browse/HDFS-4860
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 0.20.204.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha
Reporter: Trevor Lorimer
 Attachments: 0002-HDFS-4860.patch


 Currently the JMX bean returns much of the data contained on the HDFS Health 
 webpage (dfsHealth.html). However, several other attributes need to be added.
 I intend to add the following items to the appropriate bean (named in parentheses):
 Started time (NameNodeInfo),
 Compiled info (NameNodeInfo),
 Jvm MaxHeap, MaxNonHeap (JvmMetrics),
 Node Usage stats (i.e. Min, Median, Max, stdev) (NameNodeInfo),
 Count of decommissioned Live and Dead nodes (FSNamesystemState),
 Journal Status (NameNodeInfo)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-03 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Status: Open  (was: Patch Available)

The patch passed test-patch.sh on my machine several times. Rolling the dice 
again.

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.23.7, 3.0.0, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems (i.e. a Datanode failing 
 while the NN is in safemode doesn't lead to a missing-blocks message).
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.
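
 A hedged sketch of the reproduction setup described above, as it might look in 
 a MiniDFSCluster test (the config key and builder are standard HDFS test 
 utilities; the exact steps are assumptions):
 {code}
 // Sketch: shrink the heartbeat recheck interval so the NN declares the
 // stopped DN dead almost immediately, then watch the missing-blocks count.
 Configuration conf = new HdfsConfiguration();
 conf.setInt(DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY, 1);
 MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
 try {
   cluster.waitActive();
   // ... write a file, enter safemode, then:
   cluster.stopDataNode(0);
   // ... the NN's missing-blocks count should change while in safemode
 } finally {
   cluster.shutdown();
 }
 {code}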

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-03 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4832:
---

Status: Patch Available  (was: Open)

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.23.7, 3.0.0, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems (i.e. a Datanode failing 
 while the NN is in safemode doesn't lead to a missing-blocks message).
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3934) duplicative dfs_hosts entries handled wrong

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673334#comment-13673334
 ] 

Hudson commented on HDFS-3934:
--

Integrated in Hadoop-trunk-Commit #3840 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3840/])
HDFS-3934. duplicative dfs_hosts entries handled wrong. (cmccabe) (Revision 
1489065)

 Result = FAILURE
cmccabe : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489065
Files : 
* 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/HostsFileReader.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeRegistration.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java


 duplicative dfs_hosts entries handled wrong
 ---

 Key: HDFS-3934
 URL: https://issues.apache.org/jira/browse/HDFS-3934
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Andy Isaacson
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
 HDFS-3934.003.patch, HDFS-3934.004.patch, HDFS-3934.005.patch, 
 HDFS-3934.006.patch, HDFS-3934.007.patch, HDFS-3934.008.patch, 
 HDFS-3934.010.patch, HDFS-3934.011.patch, HDFS-3934.012.patch, 
 HDFS-3934.013.patch, HDFS-3934.014.patch, HDFS-3934.015.patch, 
 HDFS-3934.016.patch, HDFS-3934.017.patch


 A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
 hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
 after the NN restarts because {{getDatanodeListForReport}} does not handle 
 such a pseudo-duplicate correctly:
 # the "Remove any nodes we know about from the map" loop no longer has the 
 knowledge to remove the spurious entries
 # the "The remaining nodes are ones that are referenced by the hosts files" 
 loop does not do hostname lookups, so does not know that the IP and hostname 
 refer to the same host.
 Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
 the JSP output:  The *Node* column shows :50010 as the nodename, with HTML 
 markup {{<a 
 href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020"
  title="172.29.97.216:50010">:50010</a>}}.
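
 Illustrative only (not the committed patch): one way such pseudo-duplicates 
 could be detected is to canonicalize every hosts-file entry to an IP before 
 comparing:
 {code}
 import java.net.InetAddress;
 import java.net.UnknownHostException;

 // Sketch: map "host[:port]" entries to a canonical IP so that an IP entry in
 // dfs_hosts_allow and a hostname entry in dfs_hosts_exclude compare equal.
 class HostEntrySketch {
   static String canonicalize(String entry) {
     int colon = entry.indexOf(':');
     String host = (colon >= 0) ? entry.substring(0, colon) : entry;
     try {
       return InetAddress.getByName(host).getHostAddress();
     } catch (UnknownHostException e) {
       return host; // unresolvable entries compare by their literal text
     }
   }
 }
 {code}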

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3934) duplicative dfs_hosts entries handled wrong

2013-06-03 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673340#comment-13673340
 ] 

Colin Patrick McCabe commented on HDFS-3934:


I talked to Daryn offline about this and he said he was ok with this going in, 
though he didn't have time this week to re-review.

 duplicative dfs_hosts entries handled wrong
 ---

 Key: HDFS-3934
 URL: https://issues.apache.org/jira/browse/HDFS-3934
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Andy Isaacson
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
 HDFS-3934.003.patch, HDFS-3934.004.patch, HDFS-3934.005.patch, 
 HDFS-3934.006.patch, HDFS-3934.007.patch, HDFS-3934.008.patch, 
 HDFS-3934.010.patch, HDFS-3934.011.patch, HDFS-3934.012.patch, 
 HDFS-3934.013.patch, HDFS-3934.014.patch, HDFS-3934.015.patch, 
 HDFS-3934.016.patch, HDFS-3934.017.patch


 A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
 hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
 after the NN restarts because {{getDatanodeListForReport}} does not handle 
 such a pseudo-duplicate correctly:
 # the "Remove any nodes we know about from the map" loop no longer has the 
 knowledge to remove the spurious entries
 # the "The remaining nodes are ones that are referenced by the hosts files" 
 loop does not do hostname lookups, so does not know that the IP and hostname 
 refer to the same host.
 Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
 the JSP output:  The *Node* column shows :50010 as the nodename, with HTML 
 markup {{<a 
 href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020"
  title="172.29.97.216:50010">:50010</a>}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4870) periodically re-resolve hostnames in included and excluded datanodes list

2013-06-03 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-4870:
--

 Summary: periodically re-resolve hostnames in included and 
excluded datanodes list
 Key: HDFS-4870
 URL: https://issues.apache.org/jira/browse/HDFS-4870
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Colin Patrick McCabe
Priority: Minor


We currently only resolve the hostnames in the included and excluded datanodes 
list once, when the list is read.  The rationale for this is that in big 
clusters, DNS resolution for thousands of nodes can take a long time (when 
generating a datanode list in getDatanodeListForReport, for example).  However, 
if the DNS information changes for one of these hosts, we should reflect that.  
A background thread could do these DNS resolutions every few minutes without 
blocking any foreground operations.
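
A minimal sketch of the background-resolution idea (all names hypothetical; this 
is not the attached patch):
{code}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical: refresh a hostname->IP cache on a schedule, off the request
// path, so foreground operations like getDatanodeListForReport never block
// on DNS.
public class HostListResolver {
  private final Map<String, String> resolved = new ConcurrentHashMap<String, String>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  public HostListResolver(final Iterable<String> hosts, long periodMinutes) {
    scheduler.scheduleWithFixedDelay(new Runnable() {
      @Override
      public void run() {
        for (String host : hosts) {
          try {
            resolved.put(host, InetAddress.getByName(host).getHostAddress());
          } catch (UnknownHostException e) {
            // keep the last known address if resolution fails
          }
        }
      }
    }, 0, periodMinutes, TimeUnit.MINUTES);
  }

  // Foreground callers read the cache and never trigger DNS themselves.
  public String lookup(String host) {
    return resolved.get(host);
  }
}
{code}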

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3934) duplicative dfs_hosts entries handled wrong

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673349#comment-13673349
 ] 

Hudson commented on HDFS-3934:
--

Integrated in Hadoop-trunk-Commit #3841 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3841/])
Add needed file for HDFS-3934 (cmccabe) (Revision 1489068)

 Result = SUCCESS
cmccabe : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489068
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HostFileManager.java


 duplicative dfs_hosts entries handled wrong
 ---

 Key: HDFS-3934
 URL: https://issues.apache.org/jira/browse/HDFS-3934
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Andy Isaacson
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
 HDFS-3934.003.patch, HDFS-3934.004.patch, HDFS-3934.005.patch, 
 HDFS-3934.006.patch, HDFS-3934.007.patch, HDFS-3934.008.patch, 
 HDFS-3934.010.patch, HDFS-3934.011.patch, HDFS-3934.012.patch, 
 HDFS-3934.013.patch, HDFS-3934.014.patch, HDFS-3934.015.patch, 
 HDFS-3934.016.patch, HDFS-3934.017.patch


 A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
 hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
 after the NN restarts because {{getDatanodeListForReport}} does not handle 
 such a pseudo-duplicate correctly:
 # the "Remove any nodes we know about from the map" loop no longer has the 
 knowledge to remove the spurious entries
 # the "The remaining nodes are ones that are referenced by the hosts files" 
 loop does not do hostname lookups, so does not know that the IP and hostname 
 refer to the same host.
 Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
 the JSP output:  The *Node* column shows :50010 as the nodename, with HTML 
 markup {{<a 
 href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020"
  title="172.29.97.216:50010">:50010</a>}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3934) duplicative dfs_hosts entries handled wrong

2013-06-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673375#comment-13673375
 ] 

Hudson commented on HDFS-3934:
--

Integrated in Hadoop-trunk-Commit #3843 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3843/])
Remove extra code that was erroneously committed in HDFS-3934 (cmccabe) 
(Revision 1489083)

 Result = SUCCESS
cmccabe : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1489083
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HostFileManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java


 duplicative dfs_hosts entries handled wrong
 ---

 Key: HDFS-3934
 URL: https://issues.apache.org/jira/browse/HDFS-3934
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Andy Isaacson
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, 
 HDFS-3934.003.patch, HDFS-3934.004.patch, HDFS-3934.005.patch, 
 HDFS-3934.006.patch, HDFS-3934.007.patch, HDFS-3934.008.patch, 
 HDFS-3934.010.patch, HDFS-3934.011.patch, HDFS-3934.012.patch, 
 HDFS-3934.013.patch, HDFS-3934.014.patch, HDFS-3934.015.patch, 
 HDFS-3934.016.patch, HDFS-3934.017.patch


 A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by 
 hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} 
 after the NN restarts because {{getDatanodeListForReport}} does not handle 
 such a pseudo-duplicate correctly:
 # the "Remove any nodes we know about from the map" loop no longer has the 
 knowledge to remove the spurious entries
 # the "The remaining nodes are ones that are referenced by the hosts files" 
 loop does not do hostname lookups, so does not know that the IP and hostname 
 refer to the same host.
 Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in 
 the JSP output:  The *Node* column shows :50010 as the nodename, with HTML 
 markup {{<a 
 href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020"
  title="172.29.97.216:50010">:50010</a>}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4870) periodically re-resolve hostnames in included and excluded datanodes list

2013-06-03 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4870:
---

 Assignee: Colin Patrick McCabe
Affects Version/s: 2.0.5-alpha
   Status: Patch Available  (was: Open)

 periodically re-resolve hostnames in included and excluded datanodes list
 -

 Key: HDFS-4870
 URL: https://issues.apache.org/jira/browse/HDFS-4870
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.5-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-4870.001.patch


 We currently only resolve the hostnames in the included and excluded 
 datanodes list once, when the list is read.  The rationale for this is that 
 in big clusters, DNS resolution for thousands of nodes can take a long time 
 (when generating a datanode list in getDatanodeListForReport, for example).  
 However, if the DNS information changes for one of these hosts, we should 
 reflect that.  A background thread could do these DNS resolutions every few 
 minutes without blocking any foreground operations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4870) periodically re-resolve hostnames in included and excluded datanodes list

2013-06-03 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-4870:
---

Attachment: HDFS-4870.001.patch

 periodically re-resolve hostnames in included and excluded datanodes list
 -

 Key: HDFS-4870
 URL: https://issues.apache.org/jira/browse/HDFS-4870
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-4870.001.patch


 We currently only resolve the hostnames in the included and excluded 
 datanodes list once, when the list is read.  The rationale for this is that 
 in big clusters, DNS resolution for thousands of nodes can take a long time 
 (when generating a datanode list in getDatanodeListForReport, for example).  
 However, if the DNS information changes for one of these hosts, we should 
 reflect that.  A background thread could do these DNS resolutions every few 
 minutes without blocking any foreground operations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4860) Add additional attributes to JMX beans

2013-06-03 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673378#comment-13673378
 ] 

Todd Lipcon commented on HDFS-4860:
---

It looks like your substring math is off -- manager:FileJournalManage

Plus I think it's awfully hacky to assume that the toString() of a manager 
happens to have this particular format... why not add something like 
JournalManager.generateAttributeMap() so that each JM implementation can 
include its appropriate statistics in JMX without this string parsing hackery?

Also, this code is unnecessarily verbose:
{code}
+
+  if (jas.isDisabled()) {
+    jasMap.put("Disabled", Boolean.TRUE.toString());
+  } else {
+    jasMap.put("Disabled", Boolean.FALSE.toString());
+  }
{code}

You could just do:
{code}
jasMap.put("Disabled", String.valueOf(jas.isDisabled()));
{code}
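
For illustration, the kind of interface being suggested might look like this 
(method and type names are hypothetical, not an existing Hadoop API):
{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical: each JournalManager implementation reports its own attributes,
// so the JMX bean never has to parse toString() output.
interface JournalAttributeSource {
  Map<String, String> generateAttributeMap();
}

class FileJournalManagerSketch implements JournalAttributeSource {
  @Override
  public Map<String, String> generateAttributeMap() {
    Map<String, String> attrs = new LinkedHashMap<String, String>();
    attrs.put("manager", "FileJournalManager");
    attrs.put("Disabled", String.valueOf(false));
    return attrs;
  }
}
{code}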


 Add additional attributes to JMX beans
 --

 Key: HDFS-4860
 URL: https://issues.apache.org/jira/browse/HDFS-4860
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 0.20.204.1, 3.0.0, 2.1.0-beta, 2.0.4-alpha
Reporter: Trevor Lorimer
 Attachments: 0002-HDFS-4860.patch


 Currently the JMX bean returns much of the data contained on the HDFS Health 
 webpage (dfsHealth.html). However, several other attributes need to be added.
 I intend to add the following items to the appropriate bean (named in parentheses):
 Started time (NameNodeInfo),
 Compiled info (NameNodeInfo),
 Jvm MaxHeap, MaxNonHeap (JvmMetrics),
 Node Usage stats (i.e. Min, Median, Max, stdev) (NameNodeInfo),
 Count of decommissioned Live and Dead nodes (FSNamesystemState),
 Journal Status (NameNodeInfo)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4832) Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave

2013-06-03 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673388#comment-13673388
 ] 

Ravi Prakash commented on HDFS-4832:


Hi Kihwal, that change was made in 
https://issues.apache.org/jira/browse/HDFS-1295 . Matt reports some statistics 
there. Please let me know if it's worthwhile to take that performance hit to 
report the correct block status.

 Namenode doesn't change the number of missing blocks in safemode when DNs 
 rejoin or leave
 -

 Key: HDFS-4832
 URL: https://issues.apache.org/jira/browse/HDFS-4832
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.1.0-beta
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Priority: Critical
 Attachments: HDFS-4832.patch, HDFS-4832.patch, HDFS-4832.patch, 
 HDFS-4832.patch


 Courtesy Karri VRK Reddy!
 {quote}
 1. Namenode lost datanodes causing missing blocks
 2. Namenode was put in safe mode
 3. Datanode restarted on dead nodes 
 4. Waited for lots of time for the NN UI to reflect the recovered blocks.
 5. Forced NN out of safe mode and suddenly,  no more missing blocks anymore.
 {quote}
 I was able to replicate this on 0.23 and trunk. I set 
 dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a 
 lost datanode. The opposite case also has problems (i.e. a Datanode failing 
 while the NN is in safemode doesn't lead to a missing-blocks message).
 Without the NN updating this list of missing blocks, the grid admins will not 
 know when to take the cluster out of safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4871) Skip failing commons tests on Windows

2013-06-03 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-4871:
---

 Summary: Skip failing commons tests on Windows
 Key: HDFS-4871
 URL: https://issues.apache.org/jira/browse/HDFS-4871
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 2.1.0-beta


This is a temporary fix proposed to get CI working. We will skip the following 
failing tests on Windows:

# TestChRootedFs
# TestFSMainOperationsLocalFileSystem
# TestFcCreateMkdirLocalFs
# TestFcMainOperationsLocalFs
# TestFcPermissionsLocalFs
# TestLocalFSFileContextSymlink
# TestLocalFileSystem
# TestShellCommandFencer
# TestSocketIOWithTimeout
# TestViewFsLocalFs
# TestViewFsTrash
# TestViewFsWithAuthorityLocalFs

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4871) Skip failing commons tests on Windows

2013-06-03 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4871:


Description: 
This is a temporary fix proposed to get CI working. We will skip the following 
failing tests on Windows:

# TestChRootedFs
# TestFSMainOperationsLocalFileSystem
# TestFcCreateMkdirLocalFs
# TestFcMainOperationsLocalFs
# TestFcPermissionsLocalFs
# TestLocalFSFileContextSymlink - HADOOP-9527
# TestLocalFileSystem
# TestShellCommandFencer - HADOOP-9526
# TestSocketIOWithTimeout - HADOOP-8982
# TestViewFsLocalFs
# TestViewFsTrash
# TestViewFsWithAuthorityLocalFs

The tests will be re-enabled as we fix each. JIRAs for remaining failing tests 
to follow soon.

  was:
This is a temporary fix proposed to get CI working. We will skip the following 
failing tests on Windows:

# TestChRootedFs
# TestFSMainOperationsLocalFileSystem
# TestFcCreateMkdirLocalFs
# TestFcMainOperationsLocalFs
# TestFcPermissionsLocalFs
# TestLocalFSFileContextSymlink
# TestLocalFileSystem
# TestShellCommandFencer
# TestSocketIOWithTimeout
# TestViewFsLocalFs
# TestViewFsTrash
# TestViewFsWithAuthorityLocalFs


 Skip failing commons tests on Windows
 -

 Key: HDFS-4871
 URL: https://issues.apache.org/jira/browse/HDFS-4871
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Fix For: 2.1.0-beta


 This is a temporary fix proposed to get CI working. We will skip the 
 following failing tests on Windows:
 # TestChRootedFs
 # TestFSMainOperationsLocalFileSystem
 # TestFcCreateMkdirLocalFs
 # TestFcMainOperationsLocalFs
 # TestFcPermissionsLocalFs
 # TestLocalFSFileContextSymlink - HADOOP-9527
 # TestLocalFileSystem
 # TestShellCommandFencer - HADOOP-9526
 # TestSocketIOWithTimeout - HADOOP-8982
 # TestViewFsLocalFs
 # TestViewFsTrash
 # TestViewFsWithAuthorityLocalFs
 The tests will be re-enabled as we fix each. JIRAs for remaining failing 
 tests to follow soon.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4870) periodically re-resolve hostnames in included and excluded datanodes list

2013-06-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673490#comment-13673490
 ] 

Hadoop QA commented on HDFS-4870:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12585903/HDFS-4870.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4469//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4469//console

This message is automatically generated.

 periodically re-resolve hostnames in included and excluded datanodes list
 -

 Key: HDFS-4870
 URL: https://issues.apache.org/jira/browse/HDFS-4870
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.5-alpha
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-4870.001.patch


 We currently only resolve the hostnames in the included and excluded 
 datanodes list once, when the list is read.  The rationale for this is that 
 in big clusters, DNS resolution for thousands of nodes can take a long time 
 (when generating a datanode list in getDatanodeListForReport, for example).  
 However, if the DNS information changes for one of these hosts, we should 
 reflect that.  A background thread could do these DNS resolutions every few 
 minutes without blocking any foreground operations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4849) Idempotent create and append operations.

2013-06-03 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-4849:
--

Summary: Idempotent create and append operations.  (was: Idempotent create, 
append and delete operations.)

 Idempotent create and append operations.
 

 Key: HDFS-4849
 URL: https://issues.apache.org/jira/browse/HDFS-4849
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.4-alpha
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko

 create, append and delete operations can be made idempotent. This will reduce 
 the chance of job or other application failures when the NN fails over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4872) Idempotent delete operation.

2013-06-03 Thread Konstantin Shvachko (JIRA)
Konstantin Shvachko created HDFS-4872:
-

 Summary: Idempotent delete operation.
 Key: HDFS-4872
 URL: https://issues.apache.org/jira/browse/HDFS-4872
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.4-alpha
Reporter: Konstantin Shvachko


Making delete idempotent is important to provide uninterrupted job execution in 
case of HA failover.
This issue is to discuss different approaches to an idempotent implementation 
of delete.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4867) metaSave NPEs when there are invalid blocks in repl queue.

2013-06-03 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-4867:
---

Attachment: HDFS-4867.trunk.patch

Attaching a patch with a unit test to print orphaned blocks from metaSave. This 
will fix the immediate issue, but I struggle to understand WHY this is happening 
in the first place...

I am able to simulate orphaned blocks in the unit test by deleting the created 
file immediately before metaSave is called.
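
A hedged sketch of that simulation (fragment; MiniDFSCluster scaffolding and 
imports omitted, helper names as commonly used in HDFS tests):
{code}
// Sketch: create a file, delete it, then immediately call metaSave; any block
// still sitting in the replication queues is now "orphaned".
DistributedFileSystem fs = cluster.getFileSystem();
Path file = new Path("/metaSaveOrphanedBlocks");
DFSTestUtil.createFile(fs, file, 1024, (short) 2, 0L);
fs.delete(file, true);
cluster.getNamesystem().metaSave("metaSaveOrphanedBlocks.txt");
{code}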

 metaSave NPEs when there are invalid blocks in repl queue.
 --

 Key: HDFS-4867
 URL: https://issues.apache.org/jira/browse/HDFS-4867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Kihwal Lee
Assignee: Ravi Prakash
 Attachments: HDFS-4867.trunk.patch


 Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs 
 and stops generating the rest of the report. Normally ReplicationMonitor 
 removes such blocks quickly, but if the queue is huge it takes a very long 
 time. Also in safe mode, they stay.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HDFS-4867) metaSave NPEs when there are invalid blocks in repl queue.

2013-06-03 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov reassigned HDFS-4867:
--

Assignee: Plamen Jeliazkov  (was: Ravi Prakash)

 metaSave NPEs when there are invalid blocks in repl queue.
 --

 Key: HDFS-4867
 URL: https://issues.apache.org/jira/browse/HDFS-4867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Kihwal Lee
Assignee: Plamen Jeliazkov
 Attachments: HDFS-4867.trunk.patch


 Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs 
 and stops generating the rest of the report. Normally ReplicationMonitor 
 removes such blocks quickly, but if the queue is huge it takes a very long 
 time. Also in safe mode, they stay.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4867) metaSave NPEs when there are invalid blocks in repl queue.

2013-06-03 Thread Plamen Jeliazkov (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673553#comment-13673553
 ] 

Plamen Jeliazkov commented on HDFS-4867:


Ravi, I am going to take this issue up. If you would like to take it back 
please let me know and I will back off.

 metaSave NPEs when there are invalid blocks in repl queue.
 --

 Key: HDFS-4867
 URL: https://issues.apache.org/jira/browse/HDFS-4867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Kihwal Lee
Assignee: Plamen Jeliazkov
 Attachments: HDFS-4867.trunk.patch


 Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs 
 and stops generating the rest of the report. Normally ReplicationMonitor 
 removes such blocks quickly, but if the queue is huge it takes a very long 
 time. Also in safe mode, they stay.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4873) callGetBlockLocations returns incorrect number of blocks for snapshotted files

2013-06-03 Thread Hari Mankude (JIRA)
Hari Mankude created HDFS-4873:
--

 Summary: callGetBlockLocations returns incorrect number of blocks 
for snapshotted files
 Key: HDFS-4873
 URL: https://issues.apache.org/jira/browse/HDFS-4873
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0
Reporter: Hari Mankude
Assignee: Jing Zhao


callGetBlockLocations() returns all the blocks of a file even when they are not 
present in the snapshot version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4867) metaSave NPEs when there are invalid blocks in repl queue.

2013-06-03 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated HDFS-4867:
---

Fix Version/s: 3.0.0
   Status: Patch Available  (was: Open)

 metaSave NPEs when there are invalid blocks in repl queue.
 --

 Key: HDFS-4867
 URL: https://issues.apache.org/jira/browse/HDFS-4867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.4-alpha, 0.23.7
Reporter: Kihwal Lee
Assignee: Plamen Jeliazkov
 Fix For: 3.0.0

 Attachments: HDFS-4867.trunk.patch


 Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs 
 and stops generating the rest of the report. Normally ReplicationMonitor 
 removes such blocks quickly, but if the queue is huge it takes a very long 
 time. Also in safe mode, they stay.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4873) callGetBlockLocations returns incorrect number of blocks for snapshotted files

2013-06-03 Thread Hari Mankude (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673558#comment-13673558
 ] 

Hari Mankude commented on HDFS-4873:


The sequence of operations for reproducing the problem:

1. Create a file of size one block.
2. Take a snapshot.
3. Append some data to this file.
4. Use DfsClient.callGetBlockLocations() to get block locations of the snapshot 
version of the file. The file length is specified as Long.MAX_VALUE.
5. The call returns two LocatedBlocks for the snapshot version of the file 
instead of one block.

 callGetBlockLocations returns incorrect number of blocks for snapshotted files
 --

 Key: HDFS-4873
 URL: https://issues.apache.org/jira/browse/HDFS-4873
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0
Reporter: Hari Mankude
Assignee: Jing Zhao

 callGetBlockLocations() returns all the blocks of a file even when they are 
 not present in the snapshot version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4873) callGetBlockLocations returns incorrect number of blocks for snapshotted files

2013-06-03 Thread Hari Mankude (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673562#comment-13673562
 ] 

Hari Mankude commented on HDFS-4873:


Looks like the problem is in getBlockLocationsUpdateTimes(), where the length is 
not truncated to fileSize before calling createLocatedBlocks(). There are other 
possible solutions if the snapshot inode is passed in.
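
In other words, a hedged sketch of the missing clamp (identifiers taken from the 
comment above; the surrounding code is assumed):
{code}
// Sketch: for a snapshot file, never ask for locations past the file size
// recorded in the snapshot, even if the caller passed Long.MAX_VALUE.
long effectiveLength = Math.min(length, fileSize);
// ... then pass effectiveLength to createLocatedBlocks(...)
{code}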

 callGetBlockLocations returns incorrect number of blocks for snapshotted files
 --

 Key: HDFS-4873
 URL: https://issues.apache.org/jira/browse/HDFS-4873
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0
Reporter: Hari Mankude
Assignee: Jing Zhao

 callGetBlockLocations() returns all the blocks of a file even when they are 
 not present in the snapshot version.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4874) create with OVERWRITE deletes existing file without checking the lease: feature or a bug.

2013-06-03 Thread Konstantin Shvachko (JIRA)
Konstantin Shvachko created HDFS-4874:
-

 Summary: create with OVERWRITE deletes existing file without 
checking the lease: feature or a bug.
 Key: HDFS-4874
 URL: https://issues.apache.org/jira/browse/HDFS-4874
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.4-alpha
Reporter: Konstantin Shvachko


create with the OVERWRITE flag will remove a file under construction even if the 
issuing client does not hold a lease on the file.
It could be a bug, or a feature that applications rely upon.
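
A hedged illustration of the scenario (the two client handles are assumptions 
for illustration):
{code}
// Sketch: client A is still writing /f and holds the lease; client B then
// calls create with overwrite=true, which removes A's under-construction
// file even though B holds no lease on it.
FSDataOutputStream a = fsClientA.create(new Path("/f"), false);
a.write(1); // /f is now under construction, lease held by A
FSDataOutputStream b = fsClientB.create(new Path("/f"), true); // overwrite
{code}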

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4867) metaSave NPEs when there are invalid blocks in repl queue.

2013-06-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673725#comment-13673725
 ] 

Hadoop QA commented on HDFS-4867:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12585941/HDFS-4867.trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/4470//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4470//console

This message is automatically generated.

 metaSave NPEs when there are invalid blocks in repl queue.
 --

 Key: HDFS-4867
 URL: https://issues.apache.org/jira/browse/HDFS-4867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Kihwal Lee
Assignee: Plamen Jeliazkov
 Fix For: 3.0.0

 Attachments: HDFS-4867.trunk.patch


 Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs 
 and stops generating the rest of the report. Normally ReplicationMonitor 
 removes such blocks quickly, but if the queue is huge it takes a very long 
 time, and in safe mode they stay.
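
One plausible shape for such a fix, sketched with illustrative calls only (the 
attached patch is authoritative):

{code}
// Illustrative guard inside the metaSave report loop: skip blocks whose
// owning inode can no longer be resolved, instead of dereferencing null
// and aborting the rest of the report.
BlockCollection bc = blockManager.getBlockCollection(block);
if (bc == null) {
  out.println(block + " has no corresponding inode (orphaned); skipping");
  continue;
}
{code}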

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4867) metaSave NPEs when there are invalid blocks in repl queue.

2013-06-03 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673794#comment-13673794
 ] 

Ravi Prakash commented on HDFS-4867:


Hi Plamen, please feel free to take this up.

 metaSave NPEs when there are invalid blocks in repl queue.
 --

 Key: HDFS-4867
 URL: https://issues.apache.org/jira/browse/HDFS-4867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Kihwal Lee
Assignee: Plamen Jeliazkov
 Fix For: 3.0.0

 Attachments: HDFS-4867.trunk.patch


 Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs 
 and stops generating the rest of the report. Normally ReplicationMonitor 
 removes such blocks quickly, but if the queue is huge it takes a very long 
 time, and in safe mode they stay.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4862) SafeModeInfo.isManual() returns true when resources are low even if it wasn't entered into manually

2013-06-03 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated HDFS-4862:
---

Attachment: HDFS-4862.patch

Hi Kihwal! There is already a method to check for low resources 
(areResourcesLow()), so I don't understand why we need to fold that into 
isManual(). To me, isManual() clearly means that safemode was entered 
manually. Moreover, I could also argue that the NN should be taken out of 
low-resource safemode automatically when ResourceMonitor detects adequate 
resources, so it may not necessarily be a manual step.
This is a patch which IMHO corrects these behaviors; a sketch of the 
separation follows. Could you please review it?
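
For illustration only, a minimal sketch of the separated checks, reusing the 
fields from the SafeModeInfo snippet quoted in the issue description (this is 
an assumption about the approach, not the actual contents of the patch):

{code}
// Sketch: manual safemode and low-resource safemode are tracked
// independently, so one flag never masks the other.
private boolean isManual() {
  return extension == Integer.MAX_VALUE;  // operator entered safemode explicitly
}

private boolean areResourcesLow() {
  return resourcesLow;                    // set by the NN resource monitor
}
{code}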

 SafeModeInfo.isManual() returns true when resources are low even if it wasn't 
 entered into manually
 ---

 Key: HDFS-4862
 URL: https://issues.apache.org/jira/browse/HDFS-4862
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Ravi Prakash
 Attachments: HDFS-4862.patch


 HDFS-1594 changed isManual to this
 {code}
 private boolean isManual() {
   return extension == Integer.MAX_VALUE && !resourcesLow;
 }
 {code}
 One immediate impact of this is that when resources are low, the NN will 
 throw away all block reports from DNs. This is undesirable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4867) metaSave NPEs when there are invalid blocks in repl queue.

2013-06-03 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13673904#comment-13673904
 ] 

Konstantin Shvachko commented on HDFS-4867:
---

metaSave is probably a casualty here. Should we take a look at why orphaned / 
missing blocks are kept in replication queues in the first place?
It seems that when we delete a file, its blocks could also be removed from the 
replication queue: what is the point of replicating blocks that no longer 
belong to any file?

It still makes sense to have this case covered in metaSave().
The patch looks good. Couple of nits:
# Could you remove the 3 unused imports in the test?
# It would also be good to close the BufferedReader at the end of both test 
cases, e.g. with try-with-resources as sketched below.
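
A minimal sketch of that pattern, assuming the tests read the metaSave output 
back from a file (try-with-resources needs Java 7+; on Java 6 an equivalent 
try/finally would do):

{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

static void scanMetaSaveReport(String path) throws IOException {
  // The reader is closed automatically, even if an assertion below throws.
  try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
    String line;
    while ((line = reader.readLine()) != null) {
      // assertions on the metaSave report lines would go here
    }
  }
}
{code}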

 metaSave NPEs when there are invalid blocks in repl queue.
 --

 Key: HDFS-4867
 URL: https://issues.apache.org/jira/browse/HDFS-4867
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Kihwal Lee
Assignee: Plamen Jeliazkov
 Fix For: 3.0.0

 Attachments: HDFS-4867.trunk.patch


 Since metaSave cannot get the inode holding an orphaned/invalid block, it NPEs 
 and stops generating the rest of the report. Normally ReplicationMonitor 
 removes such blocks quickly, but if the queue is huge it takes a very long 
 time, and in safe mode they stay.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4874) create with OVERWRITE deletes existing file without checking the lease: feature or a bug.

2013-06-03 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13674059#comment-13674059
 ] 

Suresh Srinivas commented on HDFS-4874:
---

I think the current behavior is the right one. The overwrite flag indicates 
that if a file already exists, it should be overwritten, irrespective of 
whether the file is complete or still being written.

My vote would be to close this as Not a problem.
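
For reference, the client call whose semantics are being discussed; a minimal 
sketch using the standard FileSystem API:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OverwriteDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // With overwrite=true, create() replaces any existing file at the path;
    // under the current behavior this includes a file still being written
    // by another client that holds the lease.
    fs.create(new Path("/tmp/demo"), true /* overwrite */).close();
  }
}
{code}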

 create with OVERWRITE deletes existing file without checking the lease: 
 feature or a bug.
 -

 Key: HDFS-4874
 URL: https://issues.apache.org/jira/browse/HDFS-4874
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.0.4-alpha
Reporter: Konstantin Shvachko

 create with the OVERWRITE flag will remove a file under construction even if 
 the issuing client does not hold a lease on the file.
 It could be a bug, or a feature that applications rely upon.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4872) Idempotent delete operation.

2013-06-03 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13674065#comment-13674065
 ] 

Suresh Srinivas commented on HDFS-4872:
---

bq. Add a modTime parameter to the delete operation, with the meaning that the 
object is deleted only if its modification time is <= the modTime parameter.
How is time synchronization between client and server done?

Another approach: use the inode id to delete a file. But this has the 
disadvantage that the client has to know the inode id before issuing the delete.
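
A sketch of the quoted modTime-guarded delete, with hypothetical names (no such 
RPC exists; this only illustrates the proposed semantics):

{code}
// A retried delete cannot remove a file re-created after the first attempt,
// because re-creation bumps the modification time past expectedModTime.
boolean delete(String path, long expectedModTime) {
  INode inode = resolve(path);             // hypothetical path -> inode lookup
  if (inode == null) {
    return true;                           // already gone: idempotent success
  }
  if (inode.getModificationTime() > expectedModTime) {
    return false;                          // newer object at this path: refuse
  }
  return deleteInternal(inode);            // hypothetical unguarded delete
}
{code}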

 Idempotent delete operation.
 

 Key: HDFS-4872
 URL: https://issues.apache.org/jira/browse/HDFS-4872
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.4-alpha
Reporter: Konstantin Shvachko

 Making delete idempotent is important to provide uninterrupted job execution 
 in case of HA failover.
 This is to discuss different approaches to idempotent implementation of 
 delete.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4872) Idempotent delete operation.

2013-06-03 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13674068#comment-13674068
 ] 

Suresh Srinivas commented on HDFS-4872:
---

{quote}
Just mark delete idempotent.
A delete retry may delete an object that has been recreated or replaced between 
the retries in this case.
{quote}
I am -1 on this.

{quote}
Replace delete with idempotent rename to a temporary object, then delete the 
latter with non-idempotent delete.
See the beginning of this comment.
{quote}
Since this requires two requests, one for the rename and one for the delete, 
the better approach is to get the inode ID and then delete the file using that 
ID. Delete with a unique inode ID is idempotent; a sketch follows.
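
A sketch of why an inode-ID-bound delete is idempotent, again with hypothetical 
names (no such public API existed at the time):

{code}
// An inode ID names exactly one file ever: a re-created file gets a fresh ID.
// So a retried delete either removes the original file or finds it already
// gone; it can never remove a newer file that reuses the same path.
boolean deleteByInodeId(long inodeId) {
  INode inode = inodeMap.get(inodeId);     // hypothetical id -> inode lookup
  if (inode == null) {
    return true;                           // already deleted: retry is a no-op
  }
  return deleteInternal(inode);
}
{code}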


 Idempotent delete operation.
 

 Key: HDFS-4872
 URL: https://issues.apache.org/jira/browse/HDFS-4872
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Affects Versions: 2.0.4-alpha
Reporter: Konstantin Shvachko

 Making delete idempotent is important to provide uninterrupted job execution 
 in case of HA failover.
 This is to discuss different approaches to idempotent implementation of 
 delete.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira