[jira] [Updated] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Mingjie Lai (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingjie Lai updated HDFS-2303:
--

Attachment: HDFS-2303-5-trunk.patch

Eli, thanks for your review. I updated the patch and attached it here. 

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Fix For: 0.24.0, 0.23.2

 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-trunk.patch, HDFS-2303.patch.txt


 It would be nice to recompile jsvc as part of the native profile. This has a 
 number of benefits, including the ability to re-generate all binary 
 artifacts. Most of all, however, it will provide a way to generate jsvc on 
 Linux distributions that don't have a matching libc.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HDFS-3071) haadmin failover command does not provide enough detail for when target NN is not ready to be active

2012-03-09 Thread Todd Lipcon (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HDFS-3071:
-

Assignee: Todd Lipcon

 haadmin failover command does not provide enough detail for when target NN is 
 not ready to be active
 

 Key: HDFS-3071
 URL: https://issues.apache.org/jira/browse/HDFS-3071
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger
Assignee: Todd Lipcon

 When running the failover command, you can get an error message like the 
 following:
 {quote}
 $ hdfs --config $(pwd) haadmin -failover namenode2 namenode1
 Failover failed: xxx.yyy/1.2.3.4:8020 is not ready to become active
 {quote}
 Unfortunately, the error message doesn't describe why that node isn't ready 
 to be active.  In my case, the target namenode's logs don't indicate anything 
 either. It turned out that the issue was "Safe mode is ON. Resources are low 
 on NN. Safe mode must be turned off manually.", but ideally the user would be 
 told that at the time of the failover.
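The eventual fix isn't shown in this thread; as a hedged illustration of the improvement being asked for (all names here are hypothetical, not the real haadmin/NameNode code), the failover error could carry the target NN's own explanation along:

```java
// Hypothetical sketch, not the actual haadmin code: build the failover error
// so it carries the target NN's own reason (e.g. its safe-mode status)
// instead of a bare "not ready" message.
public class FailoverError {
    static String notReadyMessage(String nnAddress, String reason) {
        String base = nnAddress + " is not ready to become active";
        // Append the underlying reason when the target NN reports one.
        return (reason == null || reason.isEmpty()) ? base : base + ": " + reason;
    }

    public static void main(String[] args) {
        System.out.println(notReadyMessage("xxx.yyy/1.2.3.4:8020",
                "Safe mode is ON. Resources are low on NN."));
    }
}
```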





[jira] [Commented] (HDFS-2976) Remove unnecessary method (tokenRefetchNeeded) in DFSClient

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226052#comment-13226052
 ] 

Hudson commented on HDFS-2976:
--

Integrated in Hadoop-Hdfs-trunk #979 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/979/])
HDFS-2976 removed the unused imports that were missed in previous commit. 
(Revision 1298508)
HDFS-2976 corrected the previous wrong commit for this issue. (Revision 1298507)
HDFS-2976. Remove unnecessary method (tokenRefetchNeeded) in DFSClient.
(Contributed by Uma Maheswara Rao G) (Revision 1298495)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298508
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java

umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298507
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java

umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298495
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java


 Remove unnecessary method (tokenRefetchNeeded) in DFSClient
 ---

 Key: HDFS-2976
 URL: https://issues.apache.org/jira/browse/HDFS-2976
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Trivial
 Fix For: 0.24.0

 Attachments: HDFS-2976.patch








[jira] [Updated] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-03-09 Thread Steve Loughran (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-2966:
-

   Resolution: Fixed
Fix Version/s: (was: 0.23.2)
   Status: Resolved  (was: Patch Available)

Fixed in trunk. Not patching 0.23.x, as the test is out of sync with other 
changes there and it's not that important.

 TestNameNodeMetrics tests can fail under load
 -

 Key: HDFS-2966
 URL: https://issues.apache.org/jira/browse/HDFS-2966
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.24.0
 Environment: OS/X running intellij IDEA, firefox, winxp in a 
 virtualbox.
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Fix For: 0.24.0

 Attachments: HDFS-2966.patch, HDFS-2966.patch, HDFS-2966.patch, 
 HDFS-2966.patch


 I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
 running the HDFS tests on a desktop without enough memory for all the 
 programs trying to run. Things got swapped out and the tests failed as the DN 
 heartbeats didn't come in on time.
 The tests both rely on {{waitForDeletion()}} to block the tests until the 
 delete operation has completed, but all it does is sleep for the same number 
 of seconds as there are datanodes. This is too brittle - it may work on a 
 lightly-loaded system, but not on a system under heavy load where it is 
 taking longer to replicate than expected.
 Immediate fix: double or triple the sleep time?
 Better fix: have the thread block until all the DN heartbeats have finished.
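The "better fix" amounts to polling a condition with a deadline rather than sleeping a fixed number of seconds. A minimal sketch of that pattern (hypothetical names, not the actual test code):

```java
// Hypothetical sketch of the "better fix" pattern: poll a condition until a
// deadline instead of sleeping a fixed number of seconds per datanode.
public class WaitFor {
    interface Condition { boolean isMet(); }

    // Returns true as soon as the condition holds, false only on timeout.
    static boolean waitFor(Condition cond, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!cond.isMet()) {
            if (System.currentTimeMillis() >= deadline) return false;
            Thread.sleep(pollMs);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Stands in for "all DN heartbeats processed" in the real test.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start >= 50,
                5000, 10);
        System.out.println(ok); // true well before the 5s deadline
    }
}
```

Under load the poll simply takes longer instead of racing a fixed sleep.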





[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226061#comment-13226061
 ] 

Hudson commented on HDFS-2966:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1931 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1931/])
HDFS-2966 (Revision 1298820)

 Result = SUCCESS
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298820
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/metrics/TestNameNodeMetrics.java


 TestNameNodeMetrics tests can fail under load
 -

 Key: HDFS-2966
 URL: https://issues.apache.org/jira/browse/HDFS-2966
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.24.0
 Environment: OS/X running intellij IDEA, firefox, winxp in a 
 virtualbox.
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Fix For: 0.24.0

 Attachments: HDFS-2966.patch, HDFS-2966.patch, HDFS-2966.patch, 
 HDFS-2966.patch


 I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
 running the HDFS tests on a desktop without enough memory for all the 
 programs trying to run. Things got swapped out and the tests failed as the DN 
 heartbeats didn't come in on time.
 The tests both rely on {{waitForDeletion()}} to block the tests until the 
 delete operation has completed, but all it does is sleep for the same number 
 of seconds as there are datanodes. This is too brittle - it may work on a 
 lightly-loaded system, but not on a system under heavy load where it is 
 taking longer to replicate than expected.
 Immediate fix: double or triple the sleep time?
 Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226062#comment-13226062
 ] 

Hudson commented on HDFS-2966:
--

Integrated in Hadoop-Common-trunk-Commit #1856 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1856/])
HDFS-2966 (Revision 1298820)

 Result = SUCCESS
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298820
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/metrics/TestNameNodeMetrics.java


 TestNameNodeMetrics tests can fail under load
 -

 Key: HDFS-2966
 URL: https://issues.apache.org/jira/browse/HDFS-2966
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.24.0
 Environment: OS/X running intellij IDEA, firefox, winxp in a 
 virtualbox.
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Fix For: 0.24.0

 Attachments: HDFS-2966.patch, HDFS-2966.patch, HDFS-2966.patch, 
 HDFS-2966.patch


 I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
 running the HDFS tests on a desktop without enough memory for all the 
 programs trying to run. Things got swapped out and the tests failed as the DN 
 heartbeats didn't come in on time.
 The tests both rely on {{waitForDeletion()}} to block the tests until the 
 delete operation has completed, but all it does is sleep for the same number 
 of seconds as there are datanodes. This is too brittle - it may work on a 
 lightly-loaded system, but not on a system under heavy load where it is 
 taking longer to replicate than expected.
 Immediate fix: double or triple the sleep time?
 Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Commented] (HDFS-2492) BlockManager cross-rack replication checks only work for ScriptBasedMapping

2012-03-09 Thread Steve Loughran (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226063#comment-13226063
 ] 

Steve Loughran commented on HDFS-2492:
--

+1 for this; it's needed to complete the roll-out of the (still optional) 
topology base class; all tests are working. 

 BlockManager cross-rack replication checks only work for ScriptBasedMapping
 ---

 Key: HDFS-2492
 URL: https://issues.apache.org/jira/browse/HDFS-2492
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.23.0, 0.24.0
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Fix For: 0.24.0, 0.23.3

 Attachments: HDFS-2492-blockmanager.patch, 
 HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, 
 HDFS-2492-blockmanager.patch, HDFS-2492-blockmanager.patch, 
 HDFS-2492-blockmanager.patch


 The BlockManager cross-rack replication checks only works if script files are 
 used for replication, not if alternate plugins provide the topology 
 information.
 This is because the BlockManager sets its rack checking flag if there is a 
 filename key
 {code}
 shouldCheckForEnoughRacks = 
 conf.get(DFSConfigKeys.NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY) != null;
 {code}
 yet this filename key is only used if the topology mapper defined by 
 {code}
 DFSConfigKeys.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY
 {code}
 is an instance of {{ScriptBasedMapping}}.
 If any other mapper is used, the system may be multi-rack, but the 
 BlockManager will not be aware of this fact unless the filename key is set to 
 something non-null.
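One hedged sketch of the direction (a standalone toy, not the real NetworkTopology/BlockManager classes): derive the flag from the racks the topology has actually seen, rather than from the script-file config key:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical standalone sketch: track racks as nodes register, and derive
// shouldCheckForEnoughRacks from the observed topology rather than from the
// presence of the NET_TOPOLOGY_SCRIPT_FILE_NAME_KEY config entry.
public class RackTracker {
    private final Set<String> racks = new HashSet<>();

    void nodeAdded(String rackPath) {
        racks.add(rackPath);
    }

    // True once more than one rack exists, regardless of which
    // DNSToSwitchMapping implementation produced the topology.
    boolean shouldCheckForEnoughRacks() {
        return racks.size() > 1;
    }

    public static void main(String[] args) {
        RackTracker t = new RackTracker();
        t.nodeAdded("/rack1");
        System.out.println(t.shouldCheckForEnoughRacks()); // false: one rack
        t.nodeAdded("/rack2");
        System.out.println(t.shouldCheckForEnoughRacks()); // true: multi-rack
    }
}
```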





[jira] [Commented] (HDFS-2976) Remove unnecessary method (tokenRefetchNeeded) in DFSClient

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226078#comment-13226078
 ] 

Hudson commented on HDFS-2976:
--

Integrated in Hadoop-Mapreduce-trunk #1014 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1014/])
HDFS-2976 removed the unused imports that were missed in previous commit. 
(Revision 1298508)
HDFS-2976 corrected the previous wrong commit for this issue. (Revision 1298507)
HDFS-2976. Remove unnecessary method (tokenRefetchNeeded) in DFSClient.
(Contributed by Uma Maheswara Rao G) (Revision 1298495)

 Result = SUCCESS
umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298508
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java

umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298507
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java

umamahesh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298495
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java


 Remove unnecessary method (tokenRefetchNeeded) in DFSClient
 ---

 Key: HDFS-2976
 URL: https://issues.apache.org/jira/browse/HDFS-2976
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.24.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Trivial
 Fix For: 0.24.0

 Attachments: HDFS-2976.patch








[jira] [Commented] (HDFS-2966) TestNameNodeMetrics tests can fail under load

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226082#comment-13226082
 ] 

Hudson commented on HDFS-2966:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1865 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1865/])
HDFS-2966 (Revision 1298820)

 Result = ABORTED
stevel : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1298820
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/metrics/TestNameNodeMetrics.java


 TestNameNodeMetrics tests can fail under load
 -

 Key: HDFS-2966
 URL: https://issues.apache.org/jira/browse/HDFS-2966
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.24.0
 Environment: OS/X running intellij IDEA, firefox, winxp in a 
 virtualbox.
Reporter: Steve Loughran
Assignee: Steve Loughran
Priority: Minor
 Fix For: 0.24.0

 Attachments: HDFS-2966.patch, HDFS-2966.patch, HDFS-2966.patch, 
 HDFS-2966.patch


 I've managed to recreate HDFS-540 and HDFS-2434 by the simple technique of 
 running the HDFS tests on a desktop without enough memory for all the 
 programs trying to run. Things got swapped out and the tests failed as the DN 
 heartbeats didn't come in on time.
 The tests both rely on {{waitForDeletion()}} to block the tests until the 
 delete operation has completed, but all it does is sleep for the same number 
 of seconds as there are datanodes. This is too brittle - it may work on a 
 lightly-loaded system, but not on a system under heavy load where it is 
 taking longer to replicate than expected.
 Immediate fix: double or triple the sleep time?
 Better fix: have the thread block until all the DN heartbeats have finished.





[jira] [Commented] (HDFS-3063) NameNode should validate all coming file path

2012-03-09 Thread Daryn Sharp (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226121#comment-13226121
 ] 

Daryn Sharp commented on HDFS-3063:
---

I was posing it as a question.  I'm not an RPC expert, but quickly running 
through the call code, it does indeed look impossible to hook in w/o 
artificially coupling the RPC layer to the namenode protocol.  If so, 
definitely scratch that idea.  An RPC domain expert might provide guidance.

My only other suggestion would be to consider calling a method that 
encapsulates the if && throw.  If we want to change the exception type or 
message, it's 1 location instead of N-many to change.  Since we use generic 
IOExceptions everywhere, the client often has to mince the error string, which 
makes enforced consistency especially important.

It may make sense to always perform the check as the first statement of the 
methods.  The validity of the path is unrelated to safemode, so should I really 
have to wait for the NN to be operational before knowing my paths are invalid?
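Daryn's encapsulation suggestion sketched out (hypothetical names and message text, not the actual NameNode code): one helper owns the if/throw, so the exception type and wording change in a single place, and the check can run before any safemode logic:

```java
import java.io.IOException;

// Hypothetical sketch of the suggested helper: every RPC entry point calls
// this first, so the check runs before safemode handling and the exception
// type/message can be changed in one place instead of N.
public class PathCheck {
    static void checkAbsolutePath(String path) throws IOException {
        // Reject null and scheme-prefixed paths such as hdfs://nn:port/dir,
        // which the NameNode cannot resolve against its namespace.
        if (path == null || !path.startsWith("/")) {
            throw new IOException("Invalid path name: " + path);
        }
    }

    public static void main(String[] args) {
        try {
            checkAbsolutePath("hdfs://namenode:8020/folder/file");
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```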

 NameNode should validate all coming file path
 -

 Key: HDFS-3063
 URL: https://issues.apache.org/jira/browse/HDFS-3063
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.20.205.0
Reporter: Denny Ye
Priority: Minor
  Labels: namenode
 Attachments: HDFS-3063.patch


 NameNode provides RPC service for not only the DFS client but also 
 user-defined programs. A common case we always meet is that the user passes a 
 file path prefixed with the HDFS protocol (hdfs://{namenode}:{port}/{folder}/{file}). 
 NameNode cannot map node meta-data with this path and always throws an NPE. In 
 the user client, we only see the NullPointerException, with no other hint of 
 the step at which it occurred. 
 Also, NameNode should validate all incoming file paths for regular format.
 One exception I met:
 Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
 java.io.IOException: java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.server.namenode.INode.getPathComponents(INode.java:334)
   at 
 org.apache.hadoop.hdfs.server.namenode.INode.getPathComponents(INode.java:329)





[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226273#comment-13226273
 ] 

Eli Collins commented on HDFS-1623:
---

+1, the branch-23 patch looks good to me

 High Availability Framework for HDFS NN
 ---

 Key: HDFS-1623
 URL: https://issues.apache.org/jira/browse/HDFS-1623
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Sanjay Radia
 Fix For: 0.24.0

 Attachments: HA-tests.pdf, HDFS-1623.rel23.patch, 
 HDFS-1623.trunk.patch, HDFS-High-Availability.pdf, NameNode HA_v2.pdf, 
 NameNode HA_v2_1.pdf, Namenode HA Framework.pdf, dfsio-results.tsv, 
 ha-testplan.pdf, ha-testplan.tex








[jira] [Updated] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2303:
--

   Fix Version/s: (was: 0.23.2)
  (was: 0.24.0)
Target Version/s: 0.23.3
  Status: Patch Available  (was: Open)

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-trunk.patch, HDFS-2303.patch.txt


 It would be nice to recompile jsvc as part of the native profile. This has a 
 number of benefits, including the ability to re-generate all binary 
 artifacts. Most of all, however, it will provide a way to generate jsvc on 
 Linux distributions that don't have a matching libc.





[jira] [Updated] (HDFS-3066) cap space usage of default log4j rolling policy (hdfs specific changes)

2012-03-09 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3066:
--

Target Version/s: 0.23.3

 cap space usage of default log4j rolling policy (hdfs specific changes)
 ---

 Key: HDFS-3066
 URL: https://issues.apache.org/jira/browse/HDFS-3066
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Attachments: HDFS-3066.patch


 see HADOOP-8149 for background on this.





[jira] [Updated] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3004:
---

Attachment: (was: HDFS-3004.007.patch)

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.008.patch, 
 HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.





[jira] [Updated] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3004:
---

Attachment: HDFS-3004.008.patch

Regenerated the patch with diff --no-prefix (d'oh!)

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.008.patch, 
 HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.





[jira] [Created] (HDFS-3073) NetworkTopology::getLeaf should check for invalid topologies

2012-03-09 Thread Colin Patrick McCabe (Created) (JIRA)
NetworkTopology::getLeaf should check for invalid topologies


 Key: HDFS-3073
 URL: https://issues.apache.org/jira/browse/HDFS-3073
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


Currently, NetworkTopology::getLeaf doesn't do much validation on the 
NetworkTopology object itself.  This results in us sometimes getting a 
ClassCastException when the topology is invalid.  We should have a less 
confusing exception message for this case.
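A hedged sketch of the kind of check being proposed (standalone toy types, not the real org.apache.hadoop.net classes): validate before casting so the failure explains itself:

```java
// Hypothetical standalone sketch: validate before casting so an invalid
// topology produces a descriptive error instead of a ClassCastException.
public class TopologyCheck {
    static class Node {
        final String path;
        Node(String path) { this.path = path; }
    }

    static class InnerNode extends Node {
        final Node[] children;
        InnerNode(String path, Node... children) {
            super(path);
            this.children = children;
        }
    }

    static InnerNode asInnerNode(Node n) {
        if (!(n instanceof InnerNode)) {
            throw new IllegalStateException("Invalid network topology: "
                    + "expected an inner node but found leaf " + n.path);
        }
        return (InnerNode) n;
    }

    public static void main(String[] args) {
        Node leaf = new Node("/rack1/host1");
        InnerNode rack = new InnerNode("/rack1", leaf);
        System.out.println(asInnerNode(rack).children.length); // 1
        try {
            asInnerNode(leaf); // invalid: a leaf where an inner node belongs
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```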





[jira] [Commented] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226314#comment-13226314
 ] 

Hadoop QA commented on HDFS-2303:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12517689/HDFS-2303-5-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1977//console

This message is automatically generated.

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-trunk.patch, HDFS-2303.patch.txt


 It would be nice to recompile jsvc as part of the native profile. This has a 
 number of benefits, including the ability to re-generate all binary 
 artifacts. Most of all, however, it will provide a way to generate jsvc on 
 Linux distributions that don't have a matching libc.





[jira] [Commented] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226378#comment-13226378
 ] 

Eli Collins commented on HDFS-3004:
---

Your comments above make sense, thanks for the explanation.

Comments on latest patch:
- HDFS-2709 (hash 110b6d0) introduced EditLogInputException and used to have 
places where it was caught explicitly, that they just catch IOE, so given that 
you we no longer throw this either you can remove the class entirely
- In logTruncateMessage we should log something like stopping edit log load at 
position X instead of saying we're truncating it because we're not actually 
truncating the log (from the user's perspective)
- Isn't "always select the first choice" effectively "always skip"? Better to 
call it that, as users might think it means "use the previously selected option 
for all future choices" (eg if I chose "skip", then chose "try to fix", then 
"always choose 1st", I might not have meant to always skip).
- The conditional on answer is probably more readable as a switch; it wasn't 
clear that the else clause was the "always" ('a') case and therefore why we call 
recovery.setAlwaysChooseFirst()
- What is the "TODO: attempt to resynchronize stream here" for?
- Should use s.equals(answer) instead of answer == s etc, since if for some 
reason RecoveryContext doesn't return the exact object it was passed, this 
would break in the future
- RC#ask should log at info instead of error for the prompt and the 
"automatically choosing" log message
- RC#ask javadoc needs to be updated to match the method. Also, his choice - 
their choice =P
- RecoveryContext could use a high-level javadoc with a sentence or two since 
the name is pretty generic and the use is very specific
- Can s/LOG.error/LOG.fatal/ in NN.java for recovery failed case
- NN#printUsage has two IMPORT lines
- ++i still used in a couple files
- brackets on their own line still need fixing eg } else if {
- Why does TestRecoverTruncatedEditLog make the same dir 21 times? Maybe you 
meant to append i to the path? The test should corrupt an operation that 
mutates the namespace (vs the last op, which I believe is an op to finalize the 
log segment) so you can test that that edit is not present when you reload (eg 
corrupt the edit to mkdir /foo, then assert /foo does not exist in the namespace)
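
The s.equals(answer) point above can be shown with a tiny standalone sketch (class and variable names are illustrative, not from the patch): == compares object identity, which only happens to work while both strings are the same interned literal.

```java
public class StringCompareDemo {
    public static void main(String[] args) {
        String skip = "skip";
        // A string arriving from a prompt or a copy is a distinct object.
        String fromUser = new String("skip");

        System.out.println(skip == fromUser);      // false: different objects
        System.out.println(skip.equals(fromUser)); // true: same contents
    }
}
```

A string read from a prompt or deserialized is a distinct object, so an identity comparison silently breaks while equals() keeps working.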


 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.008.patch, 
 HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.





[jira] [Updated] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2303:
--

Attachment: HDFS-2303-5-modcommon-trunk.patch

Same patch modulo the commented line in hadoop-env.sh in hadoop-common so 
jenkins will run.

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-modcommon-trunk.patch, 
 HDFS-2303-5-trunk.patch, HDFS-2303.patch.txt







[jira] [Commented] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226395#comment-13226395
 ] 

Eli Collins commented on HDFS-2303:
---

+1  Latest patch (HDFS-2303-5-trunk.patch) looks great.  Thanks Mingjie!

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-modcommon-trunk.patch, 
 HDFS-2303-5-trunk.patch, HDFS-2303.patch.txt







[jira] [Commented] (HDFS-3066) cap space usage of default log4j rolling policy (hdfs specific changes)

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226399#comment-13226399
 ] 

Eli Collins commented on HDFS-3066:
---

+1 pending jenkins

 cap space usage of default log4j rolling policy (hdfs specific changes)
 ---

 Key: HDFS-3066
 URL: https://issues.apache.org/jira/browse/HDFS-3066
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Attachments: HDFS-3066.patch


 see HADOOP-8149 for background on this.





[jira] [Commented] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226402#comment-13226402
 ] 

Hadoop QA commented on HDFS-2303:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12517770/HDFS-2303-5-modcommon-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1978//console

This message is automatically generated.

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-modcommon-trunk.patch, 
 HDFS-2303-5-trunk.patch, HDFS-2303.patch.txt







[jira] [Updated] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2303:
--

Attachment: HDFS-2303-5-modcommon-trunk.patch

Right patch this time.

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-modcommon-trunk.patch, 
 HDFS-2303-5-modcommon-trunk.patch, HDFS-2303-5-trunk.patch, 
 HDFS-2303.patch.txt







[jira] [Commented] (HDFS-3073) NetworkTopology::getLeaf should check for invalid topologies

2012-03-09 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226445#comment-13226445
 ] 

Aaron T. Myers commented on HDFS-3073:
--

Seems like this JIRA should perhaps be moved to Common?

 NetworkTopology::getLeaf should check for invalid topologies
 

 Key: HDFS-3073
 URL: https://issues.apache.org/jira/browse/HDFS-3073
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

 Currently, NetworkTopology::getLeaf doesn't do too much validation on the 
 NetworkTopology object itself.  This results in us getting ClassCastException 
 sometimes when the topology is invalid.  We should have a less confusing 
 exception message for this case.





[jira] [Commented] (HDFS-3066) cap space usage of default log4j rolling policy (hdfs specific changes)

2012-03-09 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226479#comment-13226479
 ] 

Hadoop QA commented on HDFS-3066:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517634/HDFS-3066.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1979//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1979//console

This message is automatically generated.

 cap space usage of default log4j rolling policy (hdfs specific changes)
 ---

 Key: HDFS-3066
 URL: https://issues.apache.org/jira/browse/HDFS-3066
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Attachments: HDFS-3066.patch


 see HADOOP-8149 for background on this.





[jira] [Commented] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Colin Patrick McCabe (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226490#comment-13226490
 ] 

Colin Patrick McCabe commented on HDFS-3004:


bq. Isn't "always select the first choice" effectively "always skip"? Better to 
call it that as users might think it means "use the previously selected option 
for all future choices" (eg if I chose "skip" then chose "try to fix" then 
"always choose 1st" I might not have meant to always skip).

The first choice isn't always "skip" -- sometimes it's "truncate".

Agreed with the rest of the points.

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.008.patch, 
 HDFS-3004__namenode_recovery_tool.txt







[jira] [Updated] (HDFS-3073) NetworkTopology::getLeaf should check for invalid topologies

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3073:
---

Attachment: HDFS-3073.002.patch

* refactor getLeaf a bit

* the exception getLeaf() throws for an invalid Node now includes the offending 
Node as a string, and a helpful error message.

 NetworkTopology::getLeaf should check for invalid topologies
 

 Key: HDFS-3073
 URL: https://issues.apache.org/jira/browse/HDFS-3073
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3073.002.patch







[jira] [Commented] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Colin Patrick McCabe (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226530#comment-13226530
 ] 

Colin Patrick McCabe commented on HDFS-3004:


The compiler says: "cannot switch on a value of type String for source level 
below 1.7".

Nice idea, but it looks like it's going to have to be if statements.
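
Since a switch on String needs source level 1.7, the pre-1.7 equivalent is an equals() chain; a minimal sketch (the answer letters and return values here are illustrative, not taken from the patch):

```java
public class AnswerDispatch {
    // Dispatch on a String answer without a String switch (which needs Java 7+).
    static String dispatch(String answer) {
        if ("c".equals(answer)) {
            return "continue";
        } else if ("s".equals(answer)) {
            return "skip";
        } else if ("a".equals(answer)) {
            return "always";
        } else {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch("a")); // prints "always"
    }
}
```

Writing the literal first ("a".equals(answer)) also makes each comparison null-safe.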

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.008.patch, 
 HDFS-3004__namenode_recovery_tool.txt







[jira] [Updated] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes

2012-03-09 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3070:
--

Target Version/s: 0.23.3

 hdfs balancer doesn't balance blocks between datanodes
 --

 Key: HDFS-3070
 URL: https://issues.apache.org/jira/browse/HDFS-3070
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 0.24.0
Reporter: Stephen Chu
 Attachments: unbalanced_nodes.png, unbalanced_nodes_inservice.png


 I TeraGenerated data into DataNodes styx01 and styx02. Looking at the web UI, 
 both have over 3% disk usage.
 Attached is a screenshot of the Live Nodes web UI.
 On styx01, I run the _hdfs balancer_ command with threshold 1% and don't see 
 the blocks being balanced across all 4 datanodes (all blocks on styx01 and 
 styx02 stay put).
 HA is currently enabled.
 [schu@styx01 ~]$ hdfs haadmin -getServiceState nn1
 active
 [schu@styx01 ~]$ hdfs balancer -threshold 1
 12/03/08 10:10:32 INFO balancer.Balancer: Using a threshold of 1.0
 12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []
 12/03/08 10:10:32 INFO balancer.Balancer: p = 
 Balancer.Parameters[BalancingPolicy.Node, threshold=1.0]
 Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
 Bytes Being Moved
 Balancing took 95.0 milliseconds
 [schu@styx01 ~]$ 
 I believe with a threshold of 1% the balancer should trigger blocks being 
 moved across DataNodes, right? I am curious about the "namenodes = []" in 
 the above output.
 [schu@styx01 ~]$ hadoop version
 Hadoop 0.24.0-SNAPSHOT
 Subversion 
 git://styx01.sf.cloudera.com/home/schu/hadoop-common/hadoop-common-project/hadoop-common
  -r f6a577d697bbcd04ffbc568167c97b79479ff319
 Compiled by schu on Thu Mar  8 15:32:50 PST 2012
 From source with checksum ec971a6e7316f7fbf471b617905856b8
 From 
 http://hadoop.apache.org/hdfs/docs/r0.21.0/api/org/apache/hadoop/hdfs/server/balancer/Balancer.html:
 The threshold parameter is a fraction in the range of (0%, 100%) with a 
 default value of 10%. The threshold sets a target for whether the cluster is 
 balanced. A cluster is balanced if for each datanode, the utilization of the 
 node (ratio of used space at the node to total capacity of the node) differs 
 from the utilization of the cluster (ratio of used space in the cluster to total 
 capacity of the cluster) by no more than the threshold value. The smaller the 
 threshold, the more balanced a cluster will become. It takes more time to run 
 the balancer for small threshold values. Also for a very small threshold the 
 cluster may not be able to reach the balanced state when applications write 
 and delete files concurrently.
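
The balanced condition quoted above is easy to state in code; a small sketch of the check (not the actual Balancer implementation), assuming used/capacity are per-node byte counts and threshold is a fraction:

```java
public class BalancedCheck {
    // A cluster is balanced if every node's utilization is within
    // `threshold` of the cluster-wide utilization.
    static boolean isBalanced(double[] used, double[] capacity, double threshold) {
        double totalUsed = 0, totalCap = 0;
        for (int i = 0; i < used.length; i++) {
            totalUsed += used[i];
            totalCap += capacity[i];
        }
        double clusterUtil = totalUsed / totalCap;
        for (int i = 0; i < used.length; i++) {
            if (Math.abs(used[i] / capacity[i] - clusterUtil) > threshold) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two nodes at 3% usage, two empty, as in the report above.
        double[] used = {3, 3, 0, 0};
        double[] cap  = {100, 100, 100, 100};
        System.out.println(isBalanced(used, cap, 0.01)); // prints false
    }
}
```

With these numbers the cluster utilization is 1.5%, so at a 1% threshold the 3% nodes are out of band and the balancer should have moved blocks.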





[jira] [Updated] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2303:
--

Status: Open  (was: Patch Available)

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-modcommon-trunk.patch, 
 HDFS-2303-5-modcommon-trunk.patch, HDFS-2303-5-trunk.patch, 
 HDFS-2303.patch.txt







[jira] [Updated] (HDFS-2303) jsvc needs to be recompilable

2012-03-09 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2303:
--

Status: Patch Available  (was: Open)

 jsvc needs to be recompilable
 -

 Key: HDFS-2303
 URL: https://issues.apache.org/jira/browse/HDFS-2303
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: build, scripts
Affects Versions: 0.23.0, 0.24.0
Reporter: Roman Shaposhnik
Assignee: Roman Shaposhnik
 Attachments: HDFS-2303-2.patch.txt, HDFS-2303-3-trunk.patch, 
 HDFS-2303-4-trunk.patch, HDFS-2303-5-modcommon-trunk.patch, 
 HDFS-2303-5-modcommon-trunk.patch, HDFS-2303-5-trunk.patch, 
 HDFS-2303.patch.txt







[jira] [Commented] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226583#comment-13226583
 ] 

Eli Collins commented on HDFS-3004:
---

HDFS-3004.008.patch has a bunch of other stuff in it, probably not the patch 
you intended. 

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.008.patch, 
 HDFS-3004__namenode_recovery_tool.txt







[jira] [Commented] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes

2012-03-09 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226590#comment-13226590
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3070:
--

 12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []

The namenode list is empty.  You have to set dfs.namenode.servicerpc-address.
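
For reference, that key goes in hdfs-site.xml; with HA or federation it takes a per-namenode suffix. A minimal sketch (the nameservice/namenode ids, hostname, and port are illustrative):

```xml
<property>
  <!-- Service RPC endpoint used by datanodes and the balancer; ids are examples -->
  <name>dfs.namenode.servicerpc-address.mycluster.nn1</name>
  <value>styx01.example.com:8022</value>
</property>
```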

 hdfs balancer doesn't balance blocks between datanodes
 --

 Key: HDFS-3070
 URL: https://issues.apache.org/jira/browse/HDFS-3070
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 0.24.0
Reporter: Stephen Chu
 Attachments: unbalanced_nodes.png, unbalanced_nodes_inservice.png







[jira] [Commented] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226599#comment-13226599
 ] 

Eli Collins commented on HDFS-3070:
---

Stephen,
What are dfs.namenode.rpc-address and servicerpc-address set to in the configs?

I suspect at least the first is set, so it might be a bug in the method the 
balancer uses to determine the namenodes (eg it doesn't work for a federated or 
HA conf).

 hdfs balancer doesn't balance blocks between datanodes
 --

 Key: HDFS-3070
 URL: https://issues.apache.org/jira/browse/HDFS-3070
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 0.24.0
Reporter: Stephen Chu
 Attachments: unbalanced_nodes.png, unbalanced_nodes_inservice.png


 I TeraGenerated data into DataNodes styx01 and styx02. Looking at the web UI, 
 both have over 3% disk usage.
 Attached is a screenshot of the Live Nodes web UI.
 On styx01, I run the _hdfs balancer_ command with threshold 1% and don't see 
 the blocks being balanced across all 4 datanodes (all blocks on styx01 and 
 styx02 stay put).
 HA is currently enabled.
 [schu@styx01 ~]$ hdfs haadmin -getServiceState nn1
 active
 [schu@styx01 ~]$ hdfs balancer -threshold 1
 12/03/08 10:10:32 INFO balancer.Balancer: Using a threshold of 1.0
 12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []
 12/03/08 10:10:32 INFO balancer.Balancer: p = 
 Balancer.Parameters[BalancingPolicy.Node, threshold=1.0]
 Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
 Bytes Being Moved
 Balancing took 95.0 milliseconds
 [schu@styx01 ~]$ 
 I believe with a threshold of 1% the balancer should trigger blocks being 
 moved across DataNodes, right? I am curious about the "namenodes = []" in 
 the above output.
 [schu@styx01 ~]$ hadoop version
 Hadoop 0.24.0-SNAPSHOT
 Subversion 
 git://styx01.sf.cloudera.com/home/schu/hadoop-common/hadoop-common-project/hadoop-common
  -r f6a577d697bbcd04ffbc568167c97b79479ff319
 Compiled by schu on Thu Mar  8 15:32:50 PST 2012
 From source with checksum ec971a6e7316f7fbf471b617905856b8
 From 
 http://hadoop.apache.org/hdfs/docs/r0.21.0/api/org/apache/hadoop/hdfs/server/balancer/Balancer.html:
 The threshold parameter is a fraction in the range of (0%, 100%) with a 
 default value of 10%. The threshold sets a target for whether the cluster is 
 balanced. A cluster is balanced if for each datanode, the utilization of the 
 node (ratio of used space at the node to total capacity of the node) differs 
 from the utilization of the cluster (ratio of used space in the cluster to total 
 capacity of the cluster) by no more than the threshold value. The smaller the 
 threshold, the more balanced a cluster will become. It takes more time to run 
 the balancer for small threshold values. Also for a very small threshold the 
 cluster may not be able to reach the balanced state when applications write 
 and delete files concurrently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3050) refactor OEV to share more code with the NameNode

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226624#comment-13226624
 ] 

Eli Collins commented on HDFS-3050:
---

Hey Colin,

Agree that #4 is a good approach; we can punt on #2 for now.

Overall your patch looks great, minor stuff:
- PermissionStatus, DelegationKey, Block, DelegationTokenIdentifier diffs are 
just unused imports
- addSaxString and OfflineEditsViewer#go could use a small javadoc each 
- Brackets go on the same line as clauses (you can update your editor to do this: 
start with the Java conventions, then switch to no tabs and two-space indent)

Thanks,
Eli


 refactor OEV to share more code with the NameNode
 -

 Key: HDFS-3050
 URL: https://issues.apache.org/jira/browse/HDFS-3050
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3050.004.patch


 Currently, OEV (the offline edits viewer) re-implements all of the opcode 
 parsing logic found in the NameNode.  This duplicated code creates a 
 maintenance burden for us.
 OEV should be refactored to simply use the normal EditLog parsing code, 
 rather than rolling its own.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3044) fsck move should be non-destructive by default

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226646#comment-13226646
 ] 

Eli Collins commented on HDFS-3044:
---

- The new boolean destructive is unused
- FsckOperation is kind of overkill, probably simpler to have two bools since 
these are independent operations:
-- salvageCorruptFiles, whether to copy whatever blocks are left to lost+found
-- deleteCorruptFiles, whether to delete corrupt files
- Let's rename lostFoundMove to something like copyBlocksToLostFound to reflect 
what this method actually does; ditto update the warning since we didn't really 
copy the file (perhaps "copied accessible blocks for file X")
- Let's rename testFsckMove to testFsckMoveAndDelete and add a testFsckMove 
that tests that fsck move is not destructive
- Per the last bullet in the description would be good to at least add a log at 
INFO level indicating the # of datanodes that have checked in so an admin can 
see if the number looks off (and doesn't do a destructive operation before 
waiting for DNs to check in)
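The two-flag split suggested above could look like this minimal sketch (class, field, and method names are illustrative, not taken from the actual patch):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two independent flags suggested in the review
// above; names are illustrative, not from the HDFS-3044 patch.
public class FsckFlags {
    // Records what a real implementation would do to the filesystem.
    final List<String> actions = new ArrayList<>();

    void copyBlocksToLostFound(String path) { actions.add("salvage:" + path); }
    void deleteFile(String path)            { actions.add("delete:" + path); }

    // salvageCorruptFiles and deleteCorruptFiles are independent booleans, so
    // every combination works, including salvage-only (the non-destructive
    // default) and delete-only.
    void handleCorruptFile(String path,
                           boolean salvageCorruptFiles,
                           boolean deleteCorruptFiles) {
        if (salvageCorruptFiles) {
            copyBlocksToLostFound(path);  // copy surviving blocks, keep the file
        }
        if (deleteCorruptFiles) {
            deleteFile(path);             // destructive: opt-in only
        }
    }

    public static void main(String[] args) {
        FsckFlags fsck = new FsckFlags();
        // Non-destructive default: salvage without deleting.
        fsck.handleCorruptFile("/user/a/corrupt.dat", true, false);
        System.out.println(fsck.actions); // [salvage:/user/a/corrupt.dat]
    }
}
```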


 fsck move should be non-destructive by default
 --

 Key: HDFS-3044
 URL: https://issues.apache.org/jira/browse/HDFS-3044
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: Eli Collins
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3044.001.patch


 The fsck move behavior in the code and originally articulated in HADOOP-101 
 is:
 {quote}Current failure modes for DFS involve blocks that are completely 
 missing. The only way to fix them would be to recover chains of blocks and 
 put them into lost+found{quote}
 A directory is created with the file name, the blocks that are accessible are 
 created as individual files in this directory, then the original file is 
 removed. 
 I suspect the rationale for this behavior was that you can't use files that 
 are missing locations, and copying the block as files at least makes part of 
 the files accessible. However this behavior can also result in permanent 
 dataloss. Eg:
 - Some datanodes don't come up (eg due to a HW issues) and checkin on cluster 
 startup, files with blocks where all replicas are on these set of datanodes 
 are marked corrupt
 - Admin does fsck move, which deletes the corrupt files, saves whatever 
 blocks were available
 - The HW issues with datanodes are resolved, they are started and join the 
 cluster. The NN tells them to delete their blocks for the corrupt files since 
 the file was deleted. 
 I think we should:
 - Make fsck move non-destructive by default (eg just does a move into 
 lost+found)
 - Make the destructive behavior optional (eg --destructive so admins think 
 about what they're doing)
 - Provide better sanity checks and warnings, eg if you're running fsck and 
 not all the slaves have checked in (if using dfs.hosts) then fsck should 
 print a warning indicating this that an admin should have to override if they 
 want to do something destructive

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3045) fsck move should bail on a file if it can't create a block file

2012-03-09 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226655#comment-13226655
 ] 

Eli Collins commented on HDFS-3045:
---

Looks good. This doesn't unwind the directory creation in lost+found, but on 
second thought I think that's better (might as well salvage what blocks we can).

Nits:
- Better if the IO references the file it failed to create, eg 
{code}
throw new IOException(errmsg + ": could not create " + target + "/" + chain);
{code}
- Not your change, but let's add brackets to the if where the new IOE is thrown


 fsck move should bail on a file if it can't create a block file
 ---

 Key: HDFS-3045
 URL: https://issues.apache.org/jira/browse/HDFS-3045
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: Eli Collins
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3045.001.patch


 NamenodeFsck#lostFoundMove, when it fails to create a file for a block 
 continues on to the next block (There's a comment perhaps we should bail out 
 here... but it doesn't). It should instead fail the move for that particular 
 file (unwind the directory creation and not delete the original file). 
 Otherwise a transient failure speaking to the NN means this block is lost 
 forever.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3004:
---

Attachment: HDFS-3004.009.patch

diff against the correct change

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.009.patch, 
 HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.
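 A minimal, hypothetical illustration (not the actual HDFS-3004 code) of the prompt behavior described above: ask the operator which action to take, but pick the first option automatically when force mode ('-f') is on, or once the operator has answered 'a':

```java
import java.util.Scanner;

// Hypothetical prompt loop for an interactive recovery tool; the class and
// option strings are illustrative, not from the HDFS-3004 patch.
public class RecoveryPrompt {
    private boolean alwaysFirst;
    private final Scanner in;

    RecoveryPrompt(boolean force, Scanner in) {
        this.alwaysFirst = force;  // -f: always take the first option
        this.in = in;
    }

    // Returns the chosen option; options[0] is the default (first) choice.
    String ask(String problem, String[] options) {
        if (alwaysFirst) {
            return options[0];
        }
        System.out.println(problem);
        for (int i = 0; i < options.length; i++) {
            System.out.println("  " + (i + 1) + ") " + options[i]);
        }
        System.out.print("choice (or 'a' for always-first): ");
        String answer = in.nextLine().trim();
        if (answer.equalsIgnoreCase("a")) {
            alwaysFirst = true;  // behave like -f for all later prompts
            return options[0];
        }
        return options[Integer.parseInt(answer) - 1];
    }

    public static void main(String[] args) {
        // With force=true the prompt never blocks waiting for input.
        RecoveryPrompt p = new RecoveryPrompt(true, new Scanner(System.in));
        System.out.println(p.ask("garbled opcode in edit log",
            new String[]{"skip the opcode", "abort loading"})); // skip the opcode
    }
}
```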

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3004:
---

Attachment: (was: HDFS-3004.008.patch)

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.009.patch, 
 HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3056) Add an interface for DataBlockScanner logging

2012-03-09 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226701#comment-13226701
 ] 

Suresh Srinivas commented on HDFS-3056:
---

+1 for the patch.

 Add an interface for DataBlockScanner logging
 -

 Key: HDFS-3056
 URL: https://issues.apache.org/jira/browse/HDFS-3056
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3056_20120306.patch, h3056_20120307.patch, 
 h3056_20120307b.patch


 Some methods in the FSDatasetInterface are used only for logging in 
 DataBlockScanner.  These methods should be separated out to a new interface.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3004:
---

Attachment: HDFS-3004.010.patch

fix diff again, sigh

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.010.patch, 
 HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3004) Implement Recovery Mode

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3004:
---

Attachment: (was: HDFS-3004.009.patch)

 Implement Recovery Mode
 ---

 Key: HDFS-3004
 URL: https://issues.apache.org/jira/browse/HDFS-3004
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: tools
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
 Attachments: HDFS-3004.010.patch, 
 HDFS-3004__namenode_recovery_tool.txt


 When the NameNode metadata is corrupt for some reason, we want to be able to 
 fix it.  Obviously, we would prefer never to get in this case.  In a perfect 
 world, we never would.  However, bad data on disk can happen from time to 
 time, because of hardware errors or misconfigurations.  In the past we have 
 had to correct it manually, which is time-consuming and which can result in 
 downtime.
 Recovery mode is initialized by the system administrator.  When the NameNode 
 starts up in Recovery Mode, it will try to load the FSImage file, apply all 
 the edits from the edits log, and then write out a new image.  Then it will 
 shut down.
 Unlike in the normal startup process, the recovery mode startup process will 
 be interactive.  When the NameNode finds something that is inconsistent, it 
 will prompt the operator as to what it should do.   The operator can also 
 choose to take the first option for all prompts by starting up with the '-f' 
 flag, or typing 'a' at one of the prompts.
 I have reused as much code as possible from the NameNode in this tool.  
 Hopefully, the effort that was spent developing this will also make the 
 NameNode editLog and image processing even more robust than it already is.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3056) Add an interface for DataBlockScanner logging

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226713#comment-13226713
 ] 

Hudson commented on HDFS-3056:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1938 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1938/])
HDFS-3056: add the new file for the previous commit. (Revision 1299144)
HDFS-3056.  Add a new interface RollingLogs for DataBlockScanner logging. 
(Revision 1299139)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299144
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/RollingLogs.java

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299139
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataBlockScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeBlockScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Add an interface for DataBlockScanner logging
 -

 Key: HDFS-3056
 URL: https://issues.apache.org/jira/browse/HDFS-3056
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h3056_20120306.patch, h3056_20120307.patch, 
 h3056_20120307b.patch


 Some methods in the FSDatasetInterface are used only for logging in 
 DataBlockScanner.  These methods should be separated out to a new interface.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3056) Add an interface for DataBlockScanner logging

2012-03-09 Thread Tsz Wo (Nicholas), SZE (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3056:
-

   Resolution: Fixed
Fix Version/s: 0.23.3
   0.24.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks for the review, Suresh.

I have committed this to trunk and 0.23.

 Add an interface for DataBlockScanner logging
 -

 Key: HDFS-3056
 URL: https://issues.apache.org/jira/browse/HDFS-3056
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3056_20120306.patch, h3056_20120307.patch, 
 h3056_20120307b.patch


 Some methods in the FSDatasetInterface are used only for logging in 
 DataBlockScanner.  These methods should be separated out to a new interface.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3056) Add an interface for DataBlockScanner logging

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226719#comment-13226719
 ] 

Hudson commented on HDFS-3056:
--

Integrated in Hadoop-Common-trunk-Commit #1863 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1863/])
HDFS-3056: add the new file for the previous commit. (Revision 1299144)
HDFS-3056.  Add a new interface RollingLogs for DataBlockScanner logging. 
(Revision 1299139)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299144
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/RollingLogs.java

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299139
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataBlockScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeBlockScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Add an interface for DataBlockScanner logging
 -

 Key: HDFS-3056
 URL: https://issues.apache.org/jira/browse/HDFS-3056
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3056_20120306.patch, h3056_20120307.patch, 
 h3056_20120307b.patch


 Some methods in the FSDatasetInterface are used only for logging in 
 DataBlockScanner.  These methods should be separated out to an new interface.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3050) refactor OEV to share more code with the NameNode

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3050:
---

Attachment: HDFS-3050.006.patch

address Eli's suggestions

 refactor OEV to share more code with the NameNode
 -

 Key: HDFS-3050
 URL: https://issues.apache.org/jira/browse/HDFS-3050
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3050.006.patch


 Currently, OEV (the offline edits viewer) re-implements all of the opcode 
 parsing logic found in the NameNode.  This duplicated code creates a 
 maintenance burden for us.
 OEV should be refactored to simply use the normal EditLog parsing code, 
 rather than rolling its own.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3050) refactor OEV to share more code with the NameNode

2012-03-09 Thread Colin Patrick McCabe (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3050:
---

Attachment: (was: HDFS-3050.004.patch)

 refactor OEV to share more code with the NameNode
 -

 Key: HDFS-3050
 URL: https://issues.apache.org/jira/browse/HDFS-3050
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-3050.006.patch


 Currently, OEV (the offline edits viewer) re-implements all of the opcode 
 parsing logic found in the NameNode.  This duplicated code creates a 
 maintenance burden for us.
 OEV should be refactored to simply use the normal EditLog parsing code, 
 rather than rolling its own.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3056) Add an interface for DataBlockScanner logging

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226721#comment-13226721
 ] 

Hudson commented on HDFS-3056:
--

Integrated in Hadoop-Hdfs-0.23-Commit #660 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/660/])
Merge r1299139 and r1299144 from trunk for HDFS-3056. (Revision 1299146)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299146
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataBlockScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/RollingLogs.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeBlockScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Add an interface for DataBlockScanner logging
 -

 Key: HDFS-3056
 URL: https://issues.apache.org/jira/browse/HDFS-3056
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3056_20120306.patch, h3056_20120307.patch, 
 h3056_20120307b.patch


 Some methods in the FSDatasetInterface are used only for logging in 
 DataBlockScanner.  These methods should be separated out to a new interface.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3056) Add an interface for DataBlockScanner logging

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226722#comment-13226722
 ] 

Hudson commented on HDFS-3056:
--

Integrated in Hadoop-Common-0.23-Commit #669 (See 
[https://builds.apache.org/job/Hadoop-Common-0.23-Commit/669/])
Merge r1299139 and r1299144 from trunk for HDFS-3056. (Revision 1299146)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299146
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataBlockScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/RollingLogs.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeBlockScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Add an interface for DataBlockScanner logging
 -

 Key: HDFS-3056
 URL: https://issues.apache.org/jira/browse/HDFS-3056
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3056_20120306.patch, h3056_20120307.patch, 
 h3056_20120307b.patch


 Some methods in the FSDatasetInterface are used only for logging in 
 DataBlockScanner.  These methods should be separated out to a new interface.





[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block

2012-03-09 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226724#comment-13226724
 ] 

Aaron T. Myers commented on HDFS-3067:
--

Looks pretty good to me, Hank. Just a few small nits. +1 once these are 
addressed.

# A few lines are over 80 chars.
# Indent continuation lines that go over 80 chars by 4 spaces, instead of 2.
# Rather than use the sawException boolean, add an explicit call to fail() 
after the dis.read(), and call GenericTestUtils.assertExceptionContains(...) in 
the catch clause.
# Put some whitespace around = and < in the for loop.
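The fail()/assertExceptionContains pattern suggested in item 3 can be sketched like this. The helpers below are minimal stand-ins for JUnit's fail() and Hadoop's GenericTestUtils.assertExceptionContains, and read() simulates a checksum failure; none of this is the actual test code:

```java
// Sketch of the review suggestion: instead of a sawException flag, fail()
// explicitly when no exception is thrown, and check the message in the catch.
public class ChecksumAssertPattern {
    static void fail(String msg) { throw new AssertionError(msg); }

    static void assertExceptionContains(String expected, Throwable t) {
        String m = t.getMessage();
        if (m == null || !m.contains(expected)) {
            throw new AssertionError(
                "expected message containing \"" + expected + "\" but was: " + m);
        }
    }

    // Stand-in for dis.read() on a corrupted block.
    static void read() throws Exception {
        throw new Exception("Checksum error: block is corrupt");
    }

    public static void main(String[] args) {
        try {
            read();
            fail("expected a ChecksumException"); // reached only if read() succeeds
        } catch (Exception ex) {
            assertExceptionContains("Checksum", ex); // replaces the sawException flag
        }
        System.out.println("pattern ok");
    }
}
```

This makes a silently missing exception a test failure, rather than relying on a boolean that can be forgotten.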

 NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
 ---

 Key: HDFS-3067
 URL: https://issues.apache.org/jira/browse/HDFS-3067
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.24.0
Reporter: Henry Robinson
Assignee: Henry Robinson
 Attachments: HDFS-3607.patch


 With a singly-replicated block that's corrupted, issuing a read against it 
 twice in succession (e.g. if ChecksumException is caught by the client) gives 
 a NullPointerException.
 Here's the body of a test that reproduces the problem:
 {code}
 final short REPL_FACTOR = 1;
 final long FILE_LENGTH = 512L;
 cluster.waitActive();
 FileSystem fs = cluster.getFileSystem();
 Path path = new Path("/corrupted");
 DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L);
 DFSTestUtil.waitReplication(fs, path, REPL_FACTOR);
 ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path);
 int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block);
 assertEquals("All replicas not corrupted", REPL_FACTOR, 
 blockFilesCorrupted);
 InetSocketAddress nnAddr =
 new InetSocketAddress("localhost", cluster.getNameNodePort());
 DFSClient client = new DFSClient(nnAddr, conf);
 DFSInputStream dis = client.open(path.toString());
 byte[] arr = new byte[(int)FILE_LENGTH];
 boolean sawException = false;
 try {
   dis.read(arr, 0, (int)FILE_LENGTH);
 } catch (ChecksumException ex) { 
   sawException = true;
 }
 
 assertTrue(sawException);
 sawException = false;
 try {
   dis.read(arr, 0, (int)FILE_LENGTH); // -- NPE thrown here
 } catch (ChecksumException ex) { 
   sawException = true;
 } 
 {code}
 The stack:
 {code}
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492)
   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545)
 [snip test stack]
 {code}
 and the problem is that currentNode is null. It's left at null after the 
 first read, which fails, and then is never refreshed because the condition in 
 read that protects blockSeekTo is only triggered if the current position is 
 outside the block's range. 
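A toy model of that control flow makes the failure mode concrete. This is NOT DFSInputStream code, and the "fixed" guard is only an assumption about the shape of a fix: after the failed read clears the cached node, a position-only guard never re-seeks, so the retry dereferences null.

```java
// Toy model of the bug described above; names are illustrative stand-ins.
public class RetryGuardModel {
    private String currentNode = "dn1";   // stands in for DFSInputStream.currentNode
    private final long pos = 0;           // current read position (inside the block)
    private final long blockEnd = 512;    // end of the current block's range

    void failRead() { currentNode = null; }   // models the ChecksumException path

    int read(boolean alsoGuardOnNull) {
        boolean outsideBlock = pos > blockEnd;
        if (outsideBlock || (alsoGuardOnNull && currentNode == null)) {
            currentNode = "dn1";               // models blockSeekTo() refreshing the node
        }
        return currentNode.length();           // NPEs when the guard misses
    }

    public static void main(String[] args) {
        RetryGuardModel m = new RetryGuardModel();
        m.failRead();
        try {
            m.read(false);                     // buggy guard: position-only check
        } catch (NullPointerException e) {
            System.out.println("NPE reproduced");
        }
        m.failRead();
        System.out.println(m.read(true) > 0 ? "guarded read ok" : "unexpected");
    }
}
```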





[jira] [Commented] (HDFS-3056) Add an interface for DataBlockScanner logging

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226730#comment-13226730
 ] 

Hudson commented on HDFS-3056:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1872 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1872/])
HDFS-3056: add the new file for the previous commit. (Revision 1299144)
HDFS-3056.  Add a new interface RollingLogs for DataBlockScanner logging. 
(Revision 1299139)

 Result = ABORTED
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299144
Files : 
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/RollingLogs.java

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299139
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataBlockScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeBlockScanner.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Add an interface for DataBlockScanner logging
 -

 Key: HDFS-3056
 URL: https://issues.apache.org/jira/browse/HDFS-3056
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3056_20120306.patch, h3056_20120307.patch, 
 h3056_20120307b.patch


 Some methods in the FSDatasetInterface are used only for logging in 
 DataBlockScanner.  These methods should be separated out to a new interface.





[jira] [Commented] (HDFS-3056) Add an interface for DataBlockScanner logging

2012-03-09 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226731#comment-13226731
 ] 

Hudson commented on HDFS-3056:
--

Integrated in Hadoop-Mapreduce-0.23-Commit #677 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/677/])
Merge r1299139 and r1299144 from trunk for HDFS-3056. (Revision 1299146)

 Result = ABORTED
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299146
Files : 
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataBlockScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDatasetInterface.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/RollingLogs.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeBlockScanner.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java


 Add an interface for DataBlockScanner logging
 -

 Key: HDFS-3056
 URL: https://issues.apache.org/jira/browse/HDFS-3056
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: 0.24.0, 0.23.3

 Attachments: h3056_20120306.patch, h3056_20120307.patch, 
 h3056_20120307b.patch


 Some methods in the FSDatasetInterface are used only for logging in 
 DataBlockScanner.  These methods should be separated out to a new interface.





[jira] [Updated] (HDFS-1512) BlockSender calls deprecated method getReplica

2012-03-09 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-1512:
--

Attachment: HDFS-1512.patch

Amin, I just rebased your patch on trunk. Let's trigger Jenkins.

 BlockSender calls deprecated method getReplica
 --

 Key: HDFS-1512
 URL: https://issues.apache.org/jira/browse/HDFS-1512
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Eli Collins
Assignee: Amin Bandeali
  Labels: newbie
 Attachments: HDFS-1512.patch, HDFS-1512.patch


 HDFS-680 deprecated FSDatasetInterface#getReplica, however it is still used 
 by BlockSender which still maintains a Replica member.





[jira] [Updated] (HDFS-1512) BlockSender calls deprecated method getReplica

2012-03-09 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-1512:
--

Status: Open  (was: Patch Available)

 BlockSender calls deprecated method getReplica
 --

 Key: HDFS-1512
 URL: https://issues.apache.org/jira/browse/HDFS-1512
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Eli Collins
Assignee: Amin Bandeali
  Labels: newbie
 Attachments: HDFS-1512.patch, HDFS-1512.patch


 HDFS-680 deprecated FSDatasetInterface#getReplica, however it is still used 
 by BlockSender which still maintains a Replica member.





[jira] [Updated] (HDFS-1512) BlockSender calls deprecated method getReplica

2012-03-09 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-1512:
--

Status: Patch Available  (was: Open)

 BlockSender calls deprecated method getReplica
 --

 Key: HDFS-1512
 URL: https://issues.apache.org/jira/browse/HDFS-1512
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Eli Collins
Assignee: Amin Bandeali
  Labels: newbie
 Attachments: HDFS-1512.patch, HDFS-1512.patch


 HDFS-680 deprecated FSDatasetInterface#getReplica, however it is still used 
 by BlockSender which still maintains a Replica member.





[jira] [Commented] (HDFS-1512) BlockSender calls deprecated method getReplica

2012-03-09 Thread Amin Bandeali (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226745#comment-13226745
 ] 

Amin Bandeali commented on HDFS-1512:
-

How do I trigger?

On Fri, Mar 9, 2012 at 7:58 PM, Uma Maheswara Rao G (Updated) (JIRA) 



-- 
Amin Bandeali
Cell: 714.757.9544

Follow me on twitter
http://twitter.com/aminbandeali

DISCLAIMER
This e-mail is confidential and intended solely for the use of the
individual to whom it is addressed. If you have received this e-mail in
error please notify me. Although this message and any attachments are
believed to be free of any virus or other defect, it is the responsibility
of the recipient to ensure that it is virus free and no responsibility is
accepted by me for any loss or damage in any way arising from its use.


 BlockSender calls deprecated method getReplica
 --

 Key: HDFS-1512
 URL: https://issues.apache.org/jira/browse/HDFS-1512
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Eli Collins
Assignee: Amin Bandeali
  Labels: newbie
 Attachments: HDFS-1512.patch, HDFS-1512.patch


 HDFS-680 deprecated FSDatasetInterface#getReplica, however it is still used 
 by BlockSender which still maintains a Replica member.





[jira] [Commented] (HDFS-1512) BlockSender calls deprecated method getReplica

2012-03-09 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226744#comment-13226744
 ] 

Uma Maheswara Rao G commented on HDFS-1512:
---

Comments on the patch:
The idea of HDFS-2862 is that DN classes should not invoke APIs directly on 
FSDataset. Your patch casts directly to FSDataset and calls its APIs, which 
breaks the HDFS-2862 contract. So we may need to add an interface method to 
FSDatasetInterface?

@Nicholas, since you are the author of HDFS-2862, could you please comment on 
this point?
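The contract point above can be illustrated with a generic sketch. Dataset and ConcreteDataset below are hypothetical names, not Hadoop classes: declaring the needed method on the interface keeps callers decoupled, while a down-cast ties them to one implementation.

```java
// Generic sketch of "program to the interface, don't down-cast".
interface Dataset {
    long length();                        // the capability callers actually need
}

class ConcreteDataset implements Dataset {
    @Override public long length() { return 42L; }
    long internalOnly() { return -1L; }   // not part of the interface contract
}

public class ContractDemo {
    // Preferred: depends only on the interface, so any implementation works.
    static long viaInterface(Dataset d) { return d.length(); }

    // Fragile: throws ClassCastException for any other Dataset implementation,
    // which is exactly how casting to FSDataset would break the contract.
    static long viaCast(Dataset d) { return ((ConcreteDataset) d).internalOnly(); }

    public static void main(String[] args) {
        System.out.println(viaInterface(new ConcreteDataset()));
    }
}
```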

 BlockSender calls deprecated method getReplica
 --

 Key: HDFS-1512
 URL: https://issues.apache.org/jira/browse/HDFS-1512
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Eli Collins
Assignee: Amin Bandeali
  Labels: newbie
 Attachments: HDFS-1512.patch, HDFS-1512.patch


 HDFS-680 deprecated FSDatasetInterface#getReplica, however it is still used 
 by BlockSender which still maintains a Replica member.





[jira] [Commented] (HDFS-1512) BlockSender calls deprecated method getReplica

2012-03-09 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226746#comment-13226746
 ] 

Uma Maheswara Rao G commented on HDFS-1512:
---

@Amin, I already resubmitted your patch; Jenkins will pick it up automatically. 
Also, could you please avoid pasting unnecessary content in comments? e.g. the 
email DISCLAIMER. :-)


Thanks
Uma

 BlockSender calls deprecated method getReplica
 --

 Key: HDFS-1512
 URL: https://issues.apache.org/jira/browse/HDFS-1512
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Eli Collins
Assignee: Amin Bandeali
  Labels: newbie
 Attachments: HDFS-1512.patch, HDFS-1512.patch


 HDFS-680 deprecated FSDatasetInterface#getReplica, however it is still used 
 by BlockSender which still maintains a Replica member.





[jira] [Commented] (HDFS-3063) NameNode should validate all coming file path

2012-03-09 Thread Denny Ye (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226748#comment-13226748
 ] 

Denny Ye commented on HDFS-3063:


Thank you, Daryn. We share the same concerns about the maintainability of 
NameNode. It would be better to encapsulate the validation for each NameNode 
interface method in a common method. Another aspect of this issue is that 
similar validation should apply to every incoming method.

 NameNode should validate all coming file path
 -

 Key: HDFS-3063
 URL: https://issues.apache.org/jira/browse/HDFS-3063
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.20.205.0
Reporter: Denny Ye
Priority: Minor
  Labels: namenode
 Attachments: HDFS-3063.patch


 NameNode provides an RPC service not only for the DFS client but also for 
 user-defined programs. A common case we often hit is that a user passes a 
 file path prefixed with the HDFS protocol (hdfs://{namenode}:{port}/{folder}/{file}). 
 NameNode cannot map node metadata to such a path and always throws an NPE. On 
 the user client we only see the NullPointerException, with no hint of which 
 step caused it. 
 Also, NameNode should validate that every incoming file path has the expected format.
 One exception I met:
 Exception in thread "main" org.apache.hadoop.ipc.RemoteException: 
 java.io.IOException: java.lang.NullPointerException
   at 
 org.apache.hadoop.hdfs.server.namenode.INode.getPathComponents(INode.java:334)
   at 
 org.apache.hadoop.hdfs.server.namenode.INode.getPathComponents(INode.java:329)
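One way to surface a clear error instead of the NPE is to normalize or reject scheme-qualified paths before resolving components. This is a hedged sketch, not the attached patch; validate() and its rules are hypothetical:

```java
// Hypothetical server-side validation: strip a scheme-qualified prefix
// (hdfs://host:port) and reject malformed input with a clear message
// instead of letting a NullPointerException surface later.
import java.net.URI;

public class PathValidation {
    static String validate(String src) {
        if (src == null || src.isEmpty()) {
            throw new IllegalArgumentException("empty path");
        }
        if (src.contains("://")) {
            // Fully qualified URI: keep only the path component.
            String path = URI.create(src).getPath();
            if (path == null || !path.startsWith("/")) {
                throw new IllegalArgumentException("invalid path: " + src);
            }
            return path;
        }
        if (!src.startsWith("/")) {
            throw new IllegalArgumentException("path must be absolute: " + src);
        }
        return src;
    }

    public static void main(String[] args) {
        System.out.println(validate("hdfs://namenode:9000/folder/file")); // path part only
        System.out.println(validate("/folder/file"));
    }
}
```

With something like this at the RPC boundary, the client would see a descriptive IllegalArgumentException instead of a bare NullPointerException from INode.getPathComponents.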





[jira] [Commented] (HDFS-1512) BlockSender calls deprecated method getReplica

2012-03-09 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226778#comment-13226778
 ] 

Hadoop QA commented on HDFS-1512:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12517835/HDFS-1512.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
  org.apache.hadoop.hdfs.TestSmallBlock
  org.apache.hadoop.hdfs.TestDFSStartupVersions
  org.apache.hadoop.hdfs.TestDFSShellGenericOptions
  org.apache.hadoop.hdfs.TestModTime
  org.apache.hadoop.hdfs.TestPread

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1983//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1983//console

This message is automatically generated.

 BlockSender calls deprecated method getReplica
 --

 Key: HDFS-1512
 URL: https://issues.apache.org/jira/browse/HDFS-1512
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Eli Collins
Assignee: Amin Bandeali
  Labels: newbie
 Attachments: HDFS-1512.patch, HDFS-1512.patch


 HDFS-680 deprecated FSDatasetInterface#getReplica, however it is still used 
 by BlockSender which still maintains a Replica member.





[jira] [Commented] (HDFS-1512) BlockSender calls deprecated method getReplica

2012-03-09 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226787#comment-13226787
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1512:
--

 @Nicholas, since you are the Author for HDFS-2862, could you please comment 
 on this point?

- Please don't cast it to FSDataset; change the interface if necessary.

- The use of getReplica(..) in BlockSender cannot easily be removed.  Since it 
is not a public API, we could have removed it directly, but this simple patch 
won't work, as the unit test results indicate.  I think it needs a bigger 
change to the code.

 BlockSender calls deprecated method getReplica
 --

 Key: HDFS-1512
 URL: https://issues.apache.org/jira/browse/HDFS-1512
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Eli Collins
Assignee: Amin Bandeali
  Labels: newbie
 Attachments: HDFS-1512.patch, HDFS-1512.patch


 HDFS-680 deprecated FSDatasetInterface#getReplica, however it is still used 
 by BlockSender which still maintains a Replica member.
