[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229930#comment-13229930 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Common-0.23-Commit #681 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/681/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300813) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300813 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
TestAllowFormat is trying to be interactive --- Key: HDFS-3093 URL: https://issues.apache.org/jira/browse/HDFS-3093 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.24.0, 0.23.3 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.24.0, 0.23.3 Attachments: hdfs-3039.txt, hdfs-3093.txt, hdfs-3093.txt
HDFS-2731 broke TestAllowFormat such that it now tries to prompt the user, which of course hangs forever.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.branch-1.0.patch
Added another test and cleaned up comments.
add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch
Currently bin/hadoop namenode -format prompts the user for a Y/N to set up the directories in the local file system.
-force : the namenode formats the directories without prompting.
-nonInteractive : the namenode format returns with an exit code of 1 if a dir exists.
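The two proposed flags boil down to a small piece of argument handling. The sketch below is purely illustrative (class and field names are hypothetical, not the actual NameNode.parseArguments() code): -force suppresses the prompt entirely, while -nonInteractive turns "directories already exist" into a non-zero exit instead of a question.

```java
// Hypothetical sketch of parsing -force / -nonInteractive for a
// "namenode -format" style command line. Not the actual NameNode code.
public class FormatOptions {
    boolean force = false;        // format without prompting
    boolean interactive = true;   // prompt Y/N when dirs already exist

    static FormatOptions parse(String[] args) {
        FormatOptions opts = new FormatOptions();
        for (String arg : args) {
            if ("-force".equalsIgnoreCase(arg)) {
                opts.force = true;
            } else if ("-nonInteractive".equalsIgnoreCase(arg)) {
                // caller would exit(1) if the dirs exist instead of prompting
                opts.interactive = false;
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        FormatOptions o = parse(new String[] {"-format", "-nonInteractive"});
        System.out.println(o.force + " " + o.interactive); // prints "false false"
    }
}
```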
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229936#comment-13229936 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Mapreduce-0.23-Commit #689 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/689/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300813) Result = ABORTED todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300813 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229935#comment-13229935 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1884 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1884/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300814) Result = ABORTED todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300814 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Resolved] (HDFS-1795) Port 0.20-append changes onto 0.20-security-203
[ https://issues.apache.org/jira/browse/HDFS-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins resolved HDFS-1795. --- Resolution: Fixed Fix Version/s: 0.20.205.0
Port 0.20-append changes onto 0.20-security-203 --- Key: HDFS-1795 URL: https://issues.apache.org/jira/browse/HDFS-1795 Project: Hadoop HDFS Issue Type: Task Reporter: Andrew Purtell Fix For: 0.20.205.0 Attachments: security-append-patches.zip
Port 0.20-append changes onto 0.20-security-203. I started with a Git repository cloned from git://git.apache.org/hadoop-common.git. Branch 'branch-0.20-security-203' was used as the starting point for the work. I then enumerated the 0.20-append-specific patches in 'branch-0.20-append'. Each was applied via cherry-pick if not already present, except as noted below. This process in effect replayed the evolution of the 0.20-append branch on top of 0.20-security-203. The functional changes that HBase absolutely relies upon are specially marked. I generally ran the full test suite after each change; there were a couple of exceptions where pairs of adjacent changesets were strongly related, in which case I applied them in sequence and then ran the test suite. During this process I encountered no test failures except for one test in TestFileAppend4, a test brought in from the append branch; I still need to dig in to see whether this is a real problem or the test needs to be changed to work on top of security-203.
{noformat}
commit b9ad012eaf3915c2169a02a7130b54cbcc1d8a89
Author: Dhruba Borthakur dhr...@apache.org
Date: Fri Jun 4 07:20:10 2010 +
    HDFS-200. Support append and sync for hadoop 0.20 branch.
[Required for HBase]

commit c968e11b5a60fc6f28e4e43fbbc8a99e7e49a659
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 9 23:09:07 2010 +
    HDFS-101. DFSClient correctly detects second datanode failure in write pipeline. (Nicolas Spiegelberg via dhruba)
[Excluded: already in 0.20-security-203 according to a search of the Git change log]

commit 9f7e5ed2ff47444a1dcd12ed34796929d5b9f7d5
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 9 23:12:21 2010 +
    HDFS-988. Fix bug where saveNamespace can corrupt edits log. (Nicolas Spiegelberg via dhruba)

commit dfbbd6fbadaa95c54a1040b4fe8854b1b858d7a5
Author: Dhruba Borthakur dhr...@apache.org
Date: Thu Jun 10 18:46:03 2010 +
    HDFS-826. Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline. (dhruba)
[Required for HBase]

commit be8d32503d30208a2d7772b3b4b2a270938a4004
Author: Dhruba Borthakur dhr...@apache.org
Date: Thu Jun 10 22:25:39 2010 +
    HDFS-142. Blocks that are being written by a client are stored in the blocksBeingWritten directory. (Dhruba Borthakur, Nicolas Spiegelberg, Todd Lipcon via dhruba)

commit 856efc2e95aaacc597d669c1b053634ff752dbec
Author: Dhruba Borthakur dhr...@apache.org
Date: Fri Jun 11 00:48:41 2010 +
    HDFS-630. Client can exclude specific nodes in the write pipeline. (Nicolas Spiegelberg via dhruba)
[Required for HBase]

commit 2da1a05fc0cc0429229e87694977bae2ba370625
Author: Dhruba Borthakur dhr...@apache.org
Date: Fri Jun 11 01:02:13 2010 +
    HDFS-457. Better handling of volume failure in DataNode Storage. (Nicolas Spiegelberg via dhruba)
[Excluded: already in 0.20-security-203 according to a search of the Git change log]

commit bd42393cd3a3a731ea98b25ddb528ad03a1ab4af
Author: Dhruba Borthakur dhr...@apache.org
Date: Fri Jun 11 23:37:38 2010 +
    HDFS-1054. remove sleep before retry for allocating a block. (Todd Lipcon via dhruba)

commit 120441b9e571a5703ac39b47608e87182f0f4972
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 16 20:53:12 2010 +
    HDFS-445. pread should refetch block locations when necessary. (Todd Lipcon via dhruba)
[Excluded: already in 0.20-security-203 according to a search of the Git change log]

commit 2004aa453ba6b7ee2045093ba313ef8551a7f8da
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 16 20:59:10 2010 +
    HDFS-561. Fix write pipeline

commit 2a8227b0e6be8937fc4a654899be2a22c1f6efbe
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 16 21:13:24 2010 +
    HDFS-927. DFSInputStream retries too many times for new block locations. (Todd Lipcon via dhruba)
[Excluded: already in 0.20-security-203 according to a search of the Git change log]

commit b1e49dbf50a429cf01b636caa2666ff81ed2a016
Author: Dhruba Borthakur dhr...@apache.org
Date: Wed Jun 16 21:21:45 2010 +
    HDFS-1215. Fix unit test TestNodeCount. (Todd Lipcon via dhruba)
{noformat}
[jira] [Commented] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.
[ https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229951#comment-13229951 ] Uma Maheswara Rao G commented on HDFS-3091: --- I have been thinking about this issue. If the *replication* factor is greater than or equal to the total number of live nodes in the cluster, then we need not replace the failed node in the pipeline with a new one. But here we may need an RPC to check the number of live nodes in the cluster. Any alternative solutions? There is also the other case: if the NN is not able to choose any extra node due to the DN xceiver count or other factors, the write may also fail, right? In that case the sanity check itself may not be the correct check. Can't we simply proceed with the normal behaviour when we are not able to get a new node from the NN? Why do we need to strictly insist on one extra node and fail the write? @Nicholas, since you are the author of this feature, I would like your opinion.
Failed to add new DataNode in pipeline and will be resulted into write failure. --- Key: HDFS-3091 URL: https://issues.apache.org/jira/browse/HDFS-3091 Project: Hadoop HDFS Issue Type: Bug Components: data-node, hdfs client, name-node Affects Versions: 0.23.0, 0.24.0 Reporter: Uma Maheswara Rao G
While verifying the HDFS-1606 feature, I observed a couple of issues. Presently the ReplaceDatanodeOnFailure policy can be satisfied even though the cluster does not have enough DNs to replace the failed one, which results in a write failure.
{quote}
12/03/13 14:27:12 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Failed to add a datanode: nodes.length != original.length + 1, nodes=[10.18.52.55:50010], original=[10.18.52.55:50010]
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416)
{quote}
Let's take some cases:
1) The replication factor is 3, the cluster size is also 3, and unfortunately the pipeline drops to 1. ReplaceDatanodeOnFailure will be satisfied because *existings(1) <= replication/2 (3/2 == 1)*. But when it looks for a new node to replace the failed one, it obviously cannot find one, and the sanity check fails. This results in a write failure.
2) The replication factor is 10 (the user accidentally sets the replication factor higher than the cluster size), and the cluster has only 5 datanodes. Here the write fails even if only one node dies, for the same reason: the pipeline is at most 5 nodes, and after one datanode is killed existings will be 4, so *existings(4) <= replication/2 (10/2 == 5)* is satisfied, and obviously no replacement can be found since no extra nodes exist in the cluster. This results in a write failure.
3) Sync-related operations also fail in these situations (will post the exact scenarios).
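The two failing cases above reduce to one inequality. The sketch below is illustrative only (it is not the actual ReplaceDatanodeOnFailure code, whose full policy also considers append/hflush state): replacement is attempted when the surviving pipeline has dropped to half the replication factor or less, regardless of whether a spare node exists.

```java
// Illustrative reduction of the replacement condition described in the
// cases above; not the actual ReplaceDatanodeOnFailure implementation.
public class ReplacePolicySketch {
    // Replace a failed datanode when the surviving pipeline is at or
    // below half the replication factor (integer division, as quoted).
    static boolean shouldReplace(int replication, int existings) {
        return existings <= replication / 2;
    }

    public static void main(String[] args) {
        // Case 1: replication 3, pipeline down to 1 node -> replacement
        // required, but a 3-node cluster has no spare node to offer.
        System.out.println(shouldReplace(3, 1));   // true
        // Case 2: replication 10 on a 5-node cluster, one node lost ->
        // replacement required, again with no spare node available.
        System.out.println(shouldReplace(10, 4));  // true
    }
}
```

In both cases the policy demands a replacement that the cluster cannot supply, so the client-side sanity check (nodes.length != original.length + 1) throws and the write fails.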
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229980#comment-13229980 ] Hadoop QA commented on HDFS-3067: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517987/HDFS-3067.1.patch against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDatanodeBlockScanner
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2008//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2008//console This message is automatically generated.
NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch
With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException.
Here's the body of a test that reproduces the problem:
{code}
final short REPL_FACTOR = 1;
final long FILE_LENGTH = 512L;
cluster.waitActive();
FileSystem fs = cluster.getFileSystem();
Path path = new Path("/corrupted");
DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L);
DFSTestUtil.waitReplication(fs, path, REPL_FACTOR);
ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path);
int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block);
assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted);
InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort());
DFSClient client = new DFSClient(nnAddr, conf);
DFSInputStream dis = client.open(path.toString());
byte[] arr = new byte[(int) FILE_LENGTH];
boolean sawException = false;
try {
  dis.read(arr, 0, (int) FILE_LENGTH);
} catch (ChecksumException ex) {
  sawException = true;
}
assertTrue(sawException);
sawException = false;
try {
  dis.read(arr, 0, (int) FILE_LENGTH); // <-- NPE thrown here
} catch (ChecksumException ex) {
  sawException = true;
}
{code}
The stack:
{code}
java.lang.NullPointerException
at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545)
[snip test stack]
{code}
and the problem is that currentNode is null. It's left at null after the first read, which fails, and is then never refreshed, because the condition in read that protects blockSeekTo is only triggered when the current position is outside the block's range.
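The failure mode described above can be reduced to a tiny sketch (field and method names are illustrative, not the actual DFSInputStream source): the node reference is cleared by the failed read, but the guard around node re-selection only fires when the position leaves the block, so the second read dereferences null. Adding a null check to the guard is one plausible shape of a fix.

```java
// Hypothetical reduction of the bug: a reference cleared on error is
// only lazily refreshed when the position crosses the block boundary.
public class StreamSketch {
    Object currentNode = null;       // already cleared by the failed first read
    long pos = 0, blockEnd = 511;    // second read starts inside the same block
    final boolean nullCheckFix;      // true = also re-select when the node is null

    StreamSketch(boolean fix) { nullCheckFix = fix; }

    int read() {
        // Original guard: re-select a datanode only past the block end,
        // so a cleared currentNode is never refreshed mid-block.
        if (pos > blockEnd || (nullCheckFix && currentNode == null)) {
            currentNode = new Object();  // stands in for blockSeekTo()
        }
        return currentNode.hashCode();   // NPE here when currentNode is null
    }

    public static void main(String[] args) {
        try {
            new StreamSketch(false).read();
        } catch (NullPointerException e) {
            System.out.println("NPE on the repeated read without the null guard");
        }
        new StreamSketch(true).read();   // with the guard, no exception
    }
}
```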
[jira] [Commented] (HDFS-3057) httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR
[ https://issues.apache.org/jira/browse/HDFS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230121#comment-13230121 ] Hudson commented on HDFS-3057: -- Integrated in Hadoop-Hdfs-trunk #985 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/985/]) HDFS-3057. httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR (rvs via tucu) (Revision 1300637) Result = UNSTABLE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300637 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR -- Key: HDFS-3057 URL: https://issues.apache.org/jira/browse/HDFS-3057 Project: Hadoop HDFS Issue Type: Bug Components: scripts Affects Versions: 0.23.1 Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 0.23.3 Attachments: HDFS-3057.patch.txt
In sbin/httpfs.sh the following should use CATALINA_HOME:
{noformat}
if [ "${HTTPFS_SILENT}" != "true" ]; then
  ${CATALINA_BASE:-${BASEDIR}/share/hadoop/httpfs/tomcat}/bin/catalina.sh "$@"
else
  ${CATALINA_BASE:-${BASEDIR}/share/hadoop/httpfs/tomcat}/bin/catalina.sh "$@" > /dev/null
fi
{noformat}
and the following should honor HADOOP_LIBEXEC_DIR:
{noformat}
source ${BASEDIR}/libexec/httpfs-config.sh
{noformat}
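The requested behaviour amounts to preferring caller-supplied environment variables over the bundled defaults. The sketch below is illustrative only, not the committed patch; the BASEDIR default path is a hypothetical example, and the source/launch lines are guarded so the sketch runs standalone.

```shell
#!/usr/bin/env bash
# Illustrative sketch of honoring CATALINA_HOME and HADOOP_LIBEXEC_DIR
# in an httpfs-style launcher (not the actual committed httpfs.sh).
BASEDIR="${BASEDIR:-/usr/lib/hadoop-httpfs}"   # hypothetical install root

# Honor HADOOP_LIBEXEC_DIR if the caller exported it; else use the bundled libexec.
HADOOP_LIBEXEC_DIR="${HADOOP_LIBEXEC_DIR:-${BASEDIR}/libexec}"
# Guarded so the sketch is runnable even where the config script is absent.
[ -r "${HADOOP_LIBEXEC_DIR}/httpfs-config.sh" ] && . "${HADOOP_LIBEXEC_DIR}/httpfs-config.sh"

# Honor CATALINA_HOME; fall back to the bundled Tomcat only when unset.
CATALINA_HOME="${CATALINA_HOME:-${BASEDIR}/share/hadoop/httpfs/tomcat}"
echo "would exec: ${CATALINA_HOME}/bin/catalina.sh"
```

The key idiom is `${VAR:-default}`: the expansion yields the caller's value when VAR is set and non-empty, and the bundled path otherwise, so no extra if/else is needed.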
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230123#comment-13230123 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Hdfs-trunk #985 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/985/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300814) Result = UNSTABLE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300814 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Commented] (HDFS-3057) httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR
[ https://issues.apache.org/jira/browse/HDFS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230130#comment-13230130 ] Hudson commented on HDFS-3057: -- Integrated in Hadoop-Hdfs-0.23-Build #198 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/198/]) Merge -r 1300636:1300637 from trunk to branch. FIXES: HDFS-3057 (Revision 1300641) Result = UNSTABLE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300641 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230132#comment-13230132 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Hdfs-0.23-Build #198 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/198/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300813) Result = UNSTABLE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300813 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Commented] (HDFS-3057) httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR
[ https://issues.apache.org/jira/browse/HDFS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230152#comment-13230152 ] Hudson commented on HDFS-3057: -- Integrated in Hadoop-Mapreduce-0.23-Build #226 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/226/]) Merge -r 1300636:1300637 from trunk to branch. FIXES: HDFS-3057 (Revision 1300641) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300641 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230154#comment-13230154 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Mapreduce-0.23-Build #226 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/226/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300813) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300813 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Commented] (HDFS-3057) httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR
[ https://issues.apache.org/jira/browse/HDFS-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230177#comment-13230177 ] Hudson commented on HDFS-3057: -- Integrated in Hadoop-Mapreduce-trunk #1020 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1020/]) HDFS-3057. httpfs and hdfs launcher scripts should honor CATALINA_HOME and HADOOP_LIBEXEC_DIR (rvs via tucu) (Revision 1300637) Result = FAILURE tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300637 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/libexec/httpfs-config.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/sbin/httpfs.sh * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-3093) TestAllowFormat is trying to be interactive
[ https://issues.apache.org/jira/browse/HDFS-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230179#comment-13230179 ] Hudson commented on HDFS-3093: -- Integrated in Hadoop-Mapreduce-trunk #1020 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1020/]) HDFS-3093. Fix bug where namenode -format interpreted the -force flag in reverse. Contributed by Todd Lipcon. (Revision 1300814) Result = FAILURE todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1300814 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.branch-1.0.patch
Updated the tests to check for the non-existence of the version file when the format command did not succeed.
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230304#comment-13230304 ] Henry Robinson commented on HDFS-3067: -- There are two test failures: * TestHDFSCli looks like an inherited failure - it's been failing in other pre-commit builds. * TestDatanodeBlockScanner passes every time for me locally. The test result makes it look like the standard corrupt-a-block mechanism failed by hitting a timeout. Could this be environmental? NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It is left at null after the first read, which fails, and is then never refreshed, because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range.
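The broken guard described above can be modeled with a small standalone sketch (the method and variable names here are hypothetical illustrations, not the actual DFSInputStream code): after the first read fails with a ChecksumException, currentNode is null but the position is still inside the block, so a range-only check never re-runs blockSeekTo and the next read dereferences null.

```java
// Minimal model of the seek guard discussed above (hypothetical names).
public class ReadGuardSketch {
    // Buggy condition: only re-seek when the position leaves the block range.
    static boolean buggyNeedsSeek(long pos, long blockEnd, Object currentNode) {
        return pos > blockEnd; // misses the currentNode == null case
    }
    // Fixed condition: also re-seek when the previous read left no datanode.
    static boolean fixedNeedsSeek(long pos, long blockEnd, Object currentNode) {
        return pos > blockEnd || currentNode == null;
    }
    public static void main(String[] args) {
        long pos = 0, blockEnd = 511;   // position still inside the corrupted block
        Object currentNode = null;      // left null by the failed first read
        System.out.println(buggyNeedsSeek(pos, blockEnd, currentNode)); // false -> NPE path
        System.out.println(fixedNeedsSeek(pos, blockEnd, currentNode)); // true  -> re-seek
    }
}
```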
[jira] [Created] (HDFS-3097) Use 'exec' to invoke catalina.sh in HttpFS's httpfs.sh
Use 'exec' to invoke catalina.sh in HttpFS's httpfs.sh -- Key: HDFS-3097 URL: https://issues.apache.org/jira/browse/HDFS-3097 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.1 Reporter: Herman Chen Without 'exec', the shell spawns a new child process, which defeats the purpose when you would like to run the server in the foreground with 'httpfs.sh run', which eventually invokes 'catalina.sh run'.
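The difference can be demonstrated with a tiny shell experiment (a sketch independent of httpfs.sh itself): 'exec' replaces the current shell process instead of forking a child, so the invoked command keeps the same PID and receives foreground signals directly.

```shell
# Without exec: the outer shell forks a child, so two distinct PIDs appear.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"' | sort -u | wc -l)
# With exec: the outer shell is replaced in place, so the PID is unchanged.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"' | sort -u | wc -l)
echo "distinct PIDs without exec: $without_exec"  # 2
echo "distinct PIDs with exec: $with_exec"        # 1
```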
[jira] [Created] (HDFS-3098) Update FsShell tests for quoted metachars
Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Need to add tests to TestDFSShell for quoted metachars. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-3095) Namenode format should not create the storage directory if it doesn't exist
[ https://issues.apache.org/jira/browse/HDFS-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li resolved HDFS-3095. -- Resolution: Invalid Release Note: Todd made a point here. HDFS user should not have write permission beyond mount points. Namenode format should not create the storage directory if it doesn't exist Key: HDFS-3095 URL: https://issues.apache.org/jira/browse/HDFS-3095 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.24.0, 1.1.0 Reporter: Brandon Li Assignee: Brandon Li The storage directory can be a mount point. Automatically creating the mount point could be problematic. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3087) Decomissioning on NN restart can complete without blocks being replicated
[ https://issues.apache.org/jira/browse/HDFS-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230357#comment-13230357 ] Suresh Srinivas commented on HDFS-3087: --- Kihwal, this is a good bug find. We should fix this, though the problem is not that serious. Prior to 0.23, we shut down the datanode once decommission completed. After HDFS-1547 we no longer shut down the DN; it continues to show as decommissioned. The expectation is that an admin can shut down the decommissioned DNs at a later time and proceed with maintenance of the node. Given this, the question is: after we mark a DN as decommissioned, what happens when its block report comes in? I suspect we move back to decommission-in-progress. How about using the flag that DatanodeDescriptor has for tracking the first block report? We should not mark a DN as decommissioned if its block report has not been received. I also agree that we should not mark anything as decommissioned until we come out of safemode. Decomissioning on NN restart can complete without blocks being replicated - Key: HDFS-3087 URL: https://issues.apache.org/jira/browse/HDFS-3087 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0, 0.24.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.0, 0.24.0, 0.23.2, 0.23.3 If a data node is added to the exclude list and the name node is restarted, decommissioning happens right away on data node registration. At this point the initial block report has not been sent, so the name node thinks the node has zero blocks and decommissioning completes very quickly, without replicating the blocks on that node.
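The gating Suresh suggests can be sketched as a simple predicate (hypothetical names, not the actual DatanodeDescriptor API): completion of decommissioning should require that the node's first block report has arrived, since before that the NN believes the node holds zero blocks and decommission is trivially, and wrongly, "complete".

```java
// Hedged sketch of the proposed decommission-completion check (hypothetical names).
public class DecomCheckSketch {
    static boolean canMarkDecommissioned(boolean firstBlockReportReceived,
                                         boolean inSafeMode,
                                         int pendingReplicationBlocks) {
        // Never complete decommission before the first block report, or while
        // still in safemode, or while blocks still need re-replication.
        return firstBlockReportReceived && !inSafeMode && pendingReplicationBlocks == 0;
    }
    public static void main(String[] args) {
        // NN just restarted, no block report yet: must NOT complete
        System.out.println(canMarkDecommissioned(false, false, 0)); // false
        // Report received, replication finished, out of safemode: OK
        System.out.println(canMarkDecommissioned(true, false, 0));  // true
    }
}
```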
[jira] [Created] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Status: Patch Available (was: Open) SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Attachment: HDFS-3099.patch Here's a trivial patch which fixes the issue. I tested this manually by starting up a 2NN and browsing to /jmx. I confirmed that the expected metrics appear with this patch, whereas they do not without it. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output.
[jira] [Updated] (HDFS-1601) Pipeline ACKs are sent as lots of tiny TCP packets
[ https://issues.apache.org/jira/browse/HDFS-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-1601: -- Fix Version/s: 0.22.1 Pipeline ACKs are sent as lots of tiny TCP packets -- Key: HDFS-1601 URL: https://issues.apache.org/jira/browse/HDFS-1601 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0, 0.22.1 Attachments: hdfs-1601-22.txt, hdfs-1601.txt, hdfs-1601.txt I noticed in an hbase benchmark that the packet counts in my network monitoring seemed high, so took a short pcap trace and found that each pipeline ACK was being sent as five packets, the first four of which only contain one byte. We should buffer these bytes and send the PipelineAck as one TCP packet. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
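The buffering fix can be illustrated in isolation (a sketch of the general technique, not the actual PipelineAck code): four one-byte "header" writes followed by a payload reach the socket as five separate writes, each of which the kernel may send as its own TCP packet, while a BufferedOutputStream flushed once per ack coalesces them into a single write.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class AckBufferSketch {
    // Counts how many write() calls reach the underlying "socket" stream.
    static class CountingStream extends ByteArrayOutputStream {
        int writes = 0;
        @Override public synchronized void write(byte[] b, int off, int len) {
            writes++; super.write(b, off, len);
        }
        @Override public synchronized void write(int b) {
            writes++; super.write(b);
        }
    }

    // Unbuffered: each tiny write becomes its own packet on the wire.
    static int unbufferedWrites() {
        CountingStream raw = new CountingStream();
        for (int i = 0; i < 4; i++) raw.write(i);  // four 1-byte "header" writes
        raw.write(new byte[]{9, 9, 9}, 0, 3);      // payload
        return raw.writes;
    }

    // Buffered: one flush per ack sends everything as a single write.
    static int bufferedWrites() {
        CountingStream raw = new CountingStream();
        BufferedOutputStream out = new BufferedOutputStream(raw, 8192);
        try {
            for (int i = 0; i < 4; i++) out.write(i);
            out.write(new byte[]{9, 9, 9}, 0, 3);
            out.flush();                           // single coalesced write
        } catch (IOException e) {                  // cannot happen on a byte array
            throw new RuntimeException(e);
        }
        return raw.writes;
    }

    public static void main(String[] args) {
        System.out.println(unbufferedWrites()); // 5
        System.out.println(bufferedWrites());   // 1
    }
}
```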
[jira] [Updated] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3005: - Resolution: Fixed Fix Version/s: 0.23.3 0.24.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I have committed this to trunk and 0.23. ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. {noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
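The failure mode here is HashMap's standard fail-fast behavior and can be reproduced without HDFS at all. In FSDataset the conflicting structural modification comes from another thread (decDfsUsed(..) running while getDfsUsed(..) iterates), which is why the committed fix synchronizes it; this minimal single-threaded demo trips the same exception:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// A structural change to a HashMap while iterating over it trips the
// fail-fast iterator with ConcurrentModificationException.
public class CmeSketch {
    static boolean iterateWhileModifying(Map<String, Long> perVolumeUsed) {
        try {
            long total = 0;
            for (Map.Entry<String, Long> e : perVolumeUsed.entrySet()) {
                total += e.getValue();
                perVolumeUsed.remove("vol2"); // structural change mid-iteration
            }
            return false; // reached only if nothing was actually removed
        } catch (ConcurrentModificationException cme) {
            return true;  // the fail-fast iterator detected the modification
        }
    }
    static boolean demo() {
        Map<String, Long> used = new HashMap<>();
        used.put("vol1", 100L); used.put("vol2", 200L); used.put("vol3", 300L);
        return iterateWhileModifying(used);
    }
    public static void main(String[] args) {
        System.out.println(demo()); // true
    }
}
```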
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230404#comment-13230404 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Common-trunk-Commit #1878 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1878/]) HDFS-3005. FSVolume.decDfsUsed(..) should be synchronized. (Revision 1301127) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301127 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230409#comment-13230409 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Common-0.23-Commit #684 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/684/]) svn merge -c 1301127 from trunk for HDFS-3005. (Revision 1301130) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301130 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230411#comment-13230411 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Hdfs-0.23-Commit #675 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/675/]) svn merge -c 1301127 from trunk for HDFS-3005. (Revision 1301130) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301130 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3004) Implement Recovery Mode
[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3004: --- Attachment: HDFS-3004.012.patch * make more exceptions skippable * rename StartupOption.ALWAYS_CHOOSE_YES to StartupOption.ALWAYS_CHOOSE_FIRST, to better reflect what it does. * refactor EditLogInputStream a bit Implement Recovery Mode --- Key: HDFS-3004 URL: https://issues.apache.org/jira/browse/HDFS-3004 Project: Hadoop HDFS Issue Type: New Feature Components: tools Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, HDFS-3004.012.patch, HDFS-3004__namenode_recovery_tool.txt When the NameNode metadata is corrupt for some reason, we want to be able to fix it. Obviously, we would prefer never to get in this case. In a perfect world, we never would. However, bad data on disk can happen from time to time, because of hardware errors or misconfigurations. In the past we have had to correct it manually, which is time-consuming and which can result in downtime. Recovery mode is initialized by the system administrator. When the NameNode starts up in Recovery Mode, it will try to load the FSImage file, apply all the edits from the edits log, and then write out a new image. Then it will shut down. Unlike in the normal startup process, the recovery mode startup process will be interactive. When the NameNode finds something that is inconsistent, it will prompt the operator as to what it should do. The operator can also choose to take the first option for all prompts by starting up with the '-f' flag, or typing 'a' at one of the prompts. I have reused as much code as possible from the NameNode in this tool. Hopefully, the effort that was spent developing this will also make the NameNode editLog and image processing even more robust than it already is. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230413#comment-13230413 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Hdfs-trunk-Commit #1953 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1953/]) HDFS-3005. FSVolume.decDfsUsed(..) should be synchronized. (Revision 1301127) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301127 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230418#comment-13230418 ] Todd Lipcon commented on HDFS-3099: --- Any chance you could add a simple test case in TestSecondaryWebUi? Should be only a few lines. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230419#comment-13230419 ] Aaron T. Myers commented on HDFS-3067: -- I bet the test failure of TestDatanodeBlockScanner is simply HDFS-2881. I've just kicked Jenkins for this patch again to see if we can get a clean run. I agree that the TestHDFSCli failure is unrelated. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: 
{code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3091) Failed to add new DataNode in pipeline and will be resulted into write failure.
[ https://issues.apache.org/jira/browse/HDFS-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230426#comment-13230426 ] Tsz Wo (Nicholas), SZE commented on HDFS-3091: -- Hi Uma, First of all, thanks for testing it. I would say the failures are expected. The feature is to guarantee the number of replicas that the user is asking. However, the cluster is too small that the guarantee is impossible. It makes sense to fail the write requests. Note that policy is a client side configuration. The user could set the policy to NEVER. For the 3-node case, the admin should disable the feature (or set the policy to NEVER in the default conf.) Does it make sense to you? Failed to add new DataNode in pipeline and will be resulted into write failure. --- Key: HDFS-3091 URL: https://issues.apache.org/jira/browse/HDFS-3091 Project: Hadoop HDFS Issue Type: Bug Components: data-node, hdfs client, name-node Affects Versions: 0.23.0, 0.24.0 Reporter: Uma Maheswara Rao G When verifying the HDFS-1606 feature, Observed couple of issues. Presently the ReplaceDatanodeOnFailure policy satisfies even though we dont have enough DN to replcae in cluster and will be resulted into write failure. 
{quote} 12/03/13 14:27:12 WARN hdfs.DFSClient: DataStreamer Exception java.io.IOException: Failed to add a datanode: nodes.length != original.length + 1, nodes=[10.18.52.55:50010], original=[10.18.52.55:50010] at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:778) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:834) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:930) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:741) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:416) {quote} Let's take some cases: 1) Replication factor is 3 and the cluster size is also 3, and unfortunately the pipeline drops to 1. ReplaceDatanodeOnFailure will be satisfied because *existings(1) <= replication/2 (3/2==1)*. But when it tries to find a new node for the replacement, it obviously cannot, and the sanity check will fail. This results in a write failure. 2) Replication factor is 10 (the user accidentally sets the replication factor higher than the cluster size) and the cluster has only 5 datanodes. Here, even if only one node fails, the write will fail for the same reason: the pipeline can be at most 5, and after one datanode is killed, existings will be 4, so *existings(4) <= replication/2 (10/2==5)* is satisfied; obviously it cannot replace the node with a new one, as no extra nodes exist in the cluster. This results in a write failure. 3) sync related operations also fail in these situations (will post the clear scenarios)
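The two failure cases above follow from the default policy's condition combined with the pipeline sanity check. A standalone sketch, assuming the condition is existings <= replication/2 as the comment's arithmetic suggests (hypothetical names, not the actual HDFS source):

```java
// Sketch of the cases described above. shouldReplace mirrors the DEFAULT
// ReplaceDatanodeOnFailure condition as described; replacementPossible mirrors
// the sanity check nodes.length == original.length + 1, which needs a spare
// live node to succeed.
public class ReplaceOnFailureSketch {
    static boolean shouldReplace(int existings, int replication) {
        return existings <= replication / 2;
    }

    static boolean replacementPossible(int liveNodes, int existings) {
        return liveNodes > existings;   // at least one extra node must exist
    }

    // The write fails when replacement is demanded but impossible.
    // Case 1: replication 3, pipeline and live nodes down to 1.
    // Case 2: replication 10, cluster of 5, one node lost -> existings 4.
    static boolean writeFails(int liveNodes, int existings, int replication) {
        return shouldReplace(existings, replication)
            && !replacementPossible(liveNodes, existings);
    }
}
```

This also shows why setting the policy to NEVER (so shouldReplace never fires) avoids the failure on small clusters.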
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230432#comment-13230432 ] Andrew Purtell commented on HDFS-3077: -- From a user perspective. bq. [Todd] I think a quorum commit is vastly superior for HA, especially given we'd like to collocate the log replicas on machines doing other work. When those machines have latency hiccups, or crash, we don't want the active NN to have to wait for long timeout periods before continuing. I think this is a promising direction. See next: bq. [Eli] BK has two of the same main issues that we have depending on an HA filer: (1) many users don't want to admin a separate storage system (even if you embed BK it will be discrete, fail independently etc) Perhaps we can go so far as to suggest the loggers be an additional thread added to the DataNodes. Perhaps some subset of the DN pool is elected for the purpose. (Need we waste a whole disk just for the transaction log? Maybe the log can be shared with DN storage. Or using an SSD device for this purpose seems reasonable, but the average user should not be expected to have nodes with such on hand.) On the one hand, this would increase the internal complexity of the DataNode implementation, even if the functionality can be pretty well partitioned -- separate package, separate thread, etc. On the other hand, there would not be yet another moving part to consider when deploying components around the cluster: ZooKeeper quorum peers, NameNodes, DataNodes, the YARN AM, the YARN NMs, HBase Masters, HBase RegionServers, etc. This idea may go too far, but IMHO embedding BookKeeper goes far enough in the other direction to give me heartburn thinking about HA cluster ops.
Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3100) failed to append data using webhdfs
failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang STEP: 1, deploy a single node hdfs 0.23.1 cluster and configure hdfs as follows: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile is created and populated with 32K * 5000 zeros, and HDFS stays healthy. I got: the script cannot finish; the file has been created but is not populated as expected, because the append operation failed. The datanode log shows that the block scanner reported a bad replica and the namenode decided to delete it. Since it is a single node cluster, the append fails. The script fails this way every time, which should not happen. Datanode and Namenode logs are attached.
[jira] [Updated] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhanwei.Wang updated HDFS-3100: --- Attachment: hadoop-wangzw-namenode-ubuntu.log hadoop-wangzw-datanode-ubuntu.log test.sh test script and logs failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh STEP: 1, deploy a single node hdfs 0.23.1 cluster and configure hdfs as: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile should be created and populated with 32K * 5000 zeros, HDFS should be OK. I got: script cannot be finished, file has been created but not be populated as expected, actually append operation failed. Datanode log shows that, blockscaner report a bad replica and nanenode decide to delete it. Since it is a single node cluster, append fail. It makes no sense that the script failed every time. Datanode and Namenode logs are attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3101) cannot read empty file using webhdfs
cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get an empty file. I got: {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=0 out of the range [0, 0); OPEN, path=/testFile"}} First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK.
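The "[0, 0)" error above is consistent with a half-open range check that rejects offset == 0 on a zero-length file. A sketch of that check and the obvious fix (hypothetical names, not the actual WebHDFS code):

```java
// Sketch of the range check implied by the error above: with a half-open
// valid range [0, length), an empty file (length 0) rejects even offset 0.
public class OffsetCheckSketch {
    // Behaviour reported in the bug.
    static boolean inRangeBuggy(long offset, long length) {
        return 0 <= offset && offset < length;
    }

    // Allowing offset == length makes a zero-byte read at EOF (and hence
    // reading an empty file) succeed.
    static boolean inRangeFixed(long offset, long length) {
        return 0 <= offset && offset <= length;
    }
}
```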
[jira] [Assigned] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE reassigned HDFS-3101: Assignee: Tsz Wo (Nicholas), SZE cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get a empty file I got: {RemoteException:{exception:IOException,javaClassName:java.io.IOException,message:Offset=0 out of the range [0, 0); OPEN, path=/testFile}} First of all, [0, 0) is not a valid range, and I think read a empty file should be OK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230440#comment-13230440 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Mapreduce-0.23-Commit #692 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/692/]) svn merge -c 1301127 from trunk for HDFS-3005. (Revision 1301130) Result = ABORTED szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301130 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3005) ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..)
[ https://issues.apache.org/jira/browse/HDFS-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230439#comment-13230439 ] Hudson commented on HDFS-3005: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1887 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1887/]) HDFS-3005. FSVolume.decDfsUsed(..) should be synchronized. (Revision 1301127) Result = ABORTED szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301127 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java ConcurrentModificationException in FSDataset$FSVolume.getDfsUsed(..) Key: HDFS-3005 URL: https://issues.apache.org/jira/browse/HDFS-3005 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.24.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3005.patch, h3005_20120312.patch, h3005_20120314.patch, h3005_20120314b.patch Saw this in [build #1888|https://builds.apache.org/job/PreCommit-HDFS-Build/1888//testReport/org.apache.hadoop.hdfs.server.datanode/TestMulitipleNNDataBlockScanner/testBlockScannerAfterRestart/]. 
{noformat} java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793) at java.util.HashMap$EntryIterator.next(HashMap.java:834) at java.util.HashMap$EntryIterator.next(HashMap.java:832) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.getDfsUsed(FSDataset.java:557) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.getDfsUsed(FSDataset.java:809) at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.access$1400(FSDataset.java:774) at org.apache.hadoop.hdfs.server.datanode.FSDataset.getDfsUsed(FSDataset.java:1124) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.sendHeartBeat(BPOfferService.java:406) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.offerService(BPOfferService.java:490) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.run(BPOfferService.java:635) at java.lang.Thread.run(Thread.java:662) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE reassigned HDFS-3100: Assignee: Tsz Wo (Nicholas), SZE failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh STEP: 1, deploy a single node hdfs 0.23.1 cluster and configure hdfs as: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile should be created and populated with 32K * 5000 zeros, HDFS should be OK. I got: script cannot be finished, file has been created but not be populated as expected, actually append operation failed. Datanode log shows that, blockscaner report a bad replica and nanenode decide to delete it. Since it is a single node cluster, append fail. It makes no sense that the script failed every time. Datanode and Namenode logs are attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3101: - Attachment: h3101_20120315.patch Hi Zhanwei, Good catch. Thanks a lot for filing this bug. Here is a patch h3101_20120315.patch: allow reading on zero size file. Would you mind also testing the patch? cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get a empty file I got: {RemoteException:{exception:IOException,javaClassName:java.io.IOException,message:Offset=0 out of the range [0, 0); OPEN, path=/testFile}} First of all, [0, 0) is not a valid range, and I think read a empty file should be OK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3101: - Status: Patch Available (was: Open) cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get a empty file I got: {RemoteException:{exception:IOException,javaClassName:java.io.IOException,message:Offset=0 out of the range [0, 0); OPEN, path=/testFile}} First of all, [0, 0) is not a valid range, and I think read a empty file should be OK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3098) Update FsShell tests for quoted metachars
[ https://issues.apache.org/jira/browse/HDFS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3098: -- Attachment: HDFS-3098.patch Add tests to ensure quoted metachars are taken literally: list directories with *s; create a dir with a * subdir and a regular subdir; delete the * subdir; ensure the other subdir was not caught by a glob and still exists. Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3098.patch Need to add tests to TestDFSShell for quoted metachars.
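The test plan above turns on quoted metacharacters being treated literally: removing a directory literally named * must not glob-match its sibling. A sketch of the distinction using java.nio's glob matcher (illustrative only; FsShell has its own globbing code):

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// An unquoted "*" glob-matches every sibling name; a quoted "*" should match
// only the entry literally named "*".
public class GlobVsLiteralSketch {
    static boolean globMatches(String pattern, String name) {
        PathMatcher m = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return m.matches(Paths.get(name));
    }

    static boolean literalMatches(String pattern, String name) {
        return pattern.equals(name);
    }
}
```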
[jira] [Updated] (HDFS-3098) Update FsShell tests for quoted metachars
[ https://issues.apache.org/jira/browse/HDFS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3098: -- Target Version/s: 0.24.0, 0.23.2 (was: 0.23.2, 0.24.0) Status: Patch Available (was: Open) Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3098.patch Need to add tests to TestDFSShell for quoted metachars. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Attachment: HDFS-3099.patch Here's another patch which just adds a simple test case. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230499#comment-13230499 ] Todd Lipcon commented on HDFS-3099: --- You're going to slap me for making you do another rev on this, but: can you change the @Before to a @BeforeClass, so that we only use one minicluster here instead of one per case? SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
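The review comment above is a test-lifecycle point: JUnit runs @Before once per test method but @BeforeClass once per class, so a shared minicluster amortizes its startup cost across all the cases. A plain-Java sketch of the difference, counting stand-in cluster starts rather than depending on the test framework:

```java
// Counts how many times the expensive fixture (a stand-in for MiniDFSCluster)
// is started under each setup style.
public class SetupLifecycleSketch {
    // @Before style: a cluster is started for every test method.
    static int perTestStarts(int testMethods) {
        int starts = 0;
        for (int i = 0; i < testMethods; i++) {
            starts++;
        }
        return starts;
    }

    // @BeforeClass style: one cluster is started and shared by all methods.
    static int perClassStarts(int testMethods) {
        return testMethods > 0 ? 1 : 0;
    }
}
```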
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.docs.patch HDFS-3094.branch-1.0.patch updated documentation for branch 1.0, attached a patch for trunk doc as its in common. add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.docs.patch Currently the bin/hadoop namenode -format prompts the user for a Y/N to setup the directories in the local file system. -force : namenode formats the directories without prompting -nonInterActive : namenode format will return with an exit code of 1 if the dir exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Status: Patch Available (was: Open) add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.docs.patch, HDFS-3094.patch Currently the bin/hadoop namenode -format prompts the user for a Y/N to setup the directories in the local file system. -force : namenode formats the directories without prompting -nonInterActive : namenode format will return with an exit code of 1 if the dir exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.patch patch for trunk, the docs patch is in a separate file HDFS-3094.docs.patch as that is in common. add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.docs.patch, HDFS-3094.patch Currently the bin/hadoop namenode -format prompts the user for a Y/N to setup the directories in the local file system. -force : namenode formats the directories without prompting -nonInterActive : namenode format will return with an exit code of 1 if the dir exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230509#comment-13230509 ] Hadoop QA commented on HDFS-3067: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517987/HDFS-3067.1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2010//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2010//console This message is automatically generated. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // -- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range.
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Attachment: HDFS-3099.patch Switch to using before/after class. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3100: - Attachment: testAppend.patch Unfortunately, this is not specific to WebHDFS. HDFS also fails with the test. testAppend.patch: unit tests similar to Zhanwei's script. failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch STEP: 1, deploy a single-node hdfs 0.23.1 cluster and configure hdfs as: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile should be created and populated with 32K * 5000 zeros, and HDFS should be OK. I got: the script cannot finish; the file has been created but is not populated as expected; the append operation failed. The datanode log shows that the block scanner reported a bad replica and the namenode decided to delete it. Since it is a single-node cluster, the append then fails. The script failed every time, which makes no sense. Datanode and Namenode logs are attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230513#comment-13230513 ] Todd Lipcon commented on HDFS-3077: --- Hey Andrew, thanks for the ops perspective. The idea of embedding these logger daemons inside others is definitely something I'm considering. Embedding in DNs is one idea -- the other direction is to actually have a quorum of NNs, so that when an edit is logged, it is also applied to the SBN's namespace. But for simplicity on a first cut, I think the plan is to go with external processes and then figure out where best to embed them. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
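For readers unfamiliar with quorum commit, the core rule behind the proposal above is just majority acknowledgement: an edit is considered durable once a strict majority of the logger daemons have acked it. A plain-Java sketch of that rule (illustrative only, not the protocol this JIRA will specify):

```java
// Majority-ack rule behind a quorum commit (illustrative sketch).
public class QuorumCommit {
    // An edit is committed once a strict majority of loggers have acked it.
    static boolean committed(int acks, int loggers) {
        return acks >= loggers / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(committed(2, 3)); // true: 2 of 3 loggers acked
        System.out.println(committed(1, 3)); // false: no majority yet
        System.out.println(committed(2, 4)); // false: a tie is not a majority
    }
}
```

With 2f+1 loggers this tolerates f failures while still making progress, which is why three external logger processes is the natural first deployment.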
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230514#comment-13230514 ] Aaron T. Myers commented on HDFS-3067: -- Looks to me like the TestDatanodeBlockScanner failure was indeed unrelated. +1, the latest patch looks good to me. I'm going to commit this momentarily. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3067: - Resolution: Fixed Fix Version/s: 0.24.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've just committed this to trunk. Thanks a lot for the contribution, Hank. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 0.24.0 Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230518#comment-13230518 ] Todd Lipcon commented on HDFS-3099: --- Excellent. +1 pending hudson SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3100) failed to append data using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230520#comment-13230520 ] Tsz Wo (Nicholas), SZE commented on HDFS-3100: -- It looks like the BlockPoolSliceScanner incorrectly marks the replica as corrupted. {noformat} 2012-03-15 13:26:27,317 WARN datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(419)) - First Verification failed for BP-426067686-10.10.10.105-1331843180223:blk_-951537730291424878_1083 java.io.IOException: Stream closed at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:134) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at java.io.DataInputStream.readShort(DataInputStream.java:295) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:78) at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:228) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyBlock(BlockPoolSliceScanner.java:378) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyFirstBlock(BlockPoolSliceScanner.java:463) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:594) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:570) at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:95) at java.lang.Thread.run(Thread.java:680) 2012-03-15 13:26:27,320 WARN datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(419)) - Second Verification failed for BP-426067686-10.10.10.105-1331843180223:blk_-951537730291424878_1083 java.io.IOException: Stream closed at java.io.BufferedInputStream.getInIfOpen(BufferedInputStream.java:134) at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) at java.io.DataInputStream.readShort(DataInputStream.java:295) at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:78) at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:228) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyBlock(BlockPoolSliceScanner.java:378) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.verifyFirstBlock(BlockPoolSliceScanner.java:463) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:594) at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:570) at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:95) at java.lang.Thread.run(Thread.java:680) 2012-03-15 13:26:27,320 WARN datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:addBlock(234)) - Adding an already existing block BP-426067686-10.10.10.105-1331843180223:blk_-951537730291424878_1084 2012-03-15 13:26:27,320 INFO datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:handleScanFailure(301)) - Reporting bad block BP-426067686-10.10.10.105-1331843180223:blk_-951537730291424878_1083 2012-03-15 13:26:27,321 INFO DataNode.clienttrace (BlockReceiver.java:run(1062)) - src: /127.0.0.1:54170, dest: /127.0.0.1:54083, bytes: 84992, op: HDFS_WRITE, cliID: DFSClient_NONMAPREDUCE_2012624116_1, offset: 0, srvID: DS-600201831-10.10.10.105-54083-1331843 {noformat} failed to append data using webhdfs --- Key: HDFS-3100 URL: https://issues.apache.org/jira/browse/HDFS-3100 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: hadoop-wangzw-datanode-ubuntu.log, hadoop-wangzw-namenode-ubuntu.log, test.sh, testAppend.patch STEP: 1, deploy a single-node hdfs 0.23.1 cluster and configure hdfs as: A) enable webhdfs B) enable append C) disable permissions 2, start hdfs 3, run the test script as attached RESULT: expected: a file named testFile should be created and populated with 32K * 5000 zeros, and HDFS should be OK. I got: the script cannot finish; the file has been created but is not populated as expected; the append operation failed. The datanode log shows that the block scanner reported a bad replica and the namenode decided to delete it. Since it is a single-node cluster, the append then fails. The script failed every time, which makes no sense. Datanode and Namenode logs are attached. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230525#comment-13230525 ] Hudson commented on HDFS-3067: -- Integrated in Hadoop-Hdfs-trunk-Commit #1954 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1954/]) HDFS-3067. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block. Contributed by Henry Robinson. (Revision 1301182) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301182 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 0.24.0 Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230526#comment-13230526 ] Hudson commented on HDFS-3067: -- Integrated in Hadoop-Common-trunk-Commit #1879 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1879/]) HDFS-3067. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block. Contributed by Henry Robinson. (Revision 1301182) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301182 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 0.24.0 Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2949) HA: Add check to active state transition to prevent operator-induced split brain
[ https://issues.apache.org/jira/browse/HDFS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230550#comment-13230550 ] Todd Lipcon commented on HDFS-2949: --- Another safety check here is to make sure that the transaction IDs match between the nodes before going active. HA: Add check to active state transition to prevent operator-induced split brain Key: HDFS-2949 URL: https://issues.apache.org/jira/browse/HDFS-2949 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0 Reporter: Todd Lipcon Currently, if the administrator mistakenly calls -transitionToActive on one NN while the other one is still active, all hell will break loose. We can add a simple check by having the NN make a getServiceState() RPC to its peer with a short (~1 second?) timeout. If the RPC succeeds and indicates the other node is active, it should refuse to enter active mode. If the RPC fails or indicates standby, it can proceed. This is just meant as a preventative safety check - we still expect users to use the -failover command which has other checks plus fencing built in. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
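The check described in this issue reduces to a small decision rule: refuse the transition only when the peer definitively reports it is active; a failed or timed-out getServiceState() RPC lets the transition proceed. A plain-Java sketch under those assumptions (PeerState and mayGoActive are hypothetical stand-ins, not Hadoop APIs):

```java
// Sketch of the operator-safety check: before entering active state,
// probe the peer NN's state with a short timeout (modeled here as an enum).
public class TransitionCheck {
    enum PeerState { ACTIVE, STANDBY, UNREACHABLE }

    // Refuse only when the peer definitively reports it is active;
    // an RPC failure (UNREACHABLE) does not block the transition,
    // since the -failover command with fencing covers that case.
    static boolean mayGoActive(PeerState peer) {
        return peer != PeerState.ACTIVE;
    }

    public static void main(String[] args) {
        System.out.println(mayGoActive(PeerState.ACTIVE));      // false: would split-brain
        System.out.println(mayGoActive(PeerState.STANDBY));     // true
        System.out.println(mayGoActive(PeerState.UNREACHABLE)); // true
    }
}
```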
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230559#comment-13230559 ] Daryn Sharp commented on HDFS-3101: --- +1 Cute edge case. Looks straightforward. cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get an empty file I got: {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=0 out of the range [0, 0); OPEN, path=/testFile"}} First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
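The error message above suggests the offset check treats the valid range as [0, len), which is empty when the file length is zero. A plain-Java sketch of that check and the obvious relaxation (hypothetical helper names, not the actual WebHDFS code):

```java
// Sketch of the range check implied by the "Offset=0 out of the range [0, 0)"
// error: a half-open [0, len) check rejects offset 0 on an empty file,
// while allowing offset == len makes a zero-byte read succeed.
public class OffsetCheck {
    static boolean inRangeBuggy(long offset, long len) {
        return offset >= 0 && offset < len;   // [0, 0) is empty, so offset 0 is rejected
    }

    static boolean inRangeFixed(long offset, long len) {
        return offset >= 0 && offset <= len;  // offset 0 on an empty file is accepted
    }

    public static void main(String[] args) {
        System.out.println(inRangeBuggy(0, 0)); // false: the IOException seen above
        System.out.println(inRangeFixed(0, 0)); // true: empty read returns no bytes
    }
}
```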
[jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
[ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230561#comment-13230561 ] Hudson commented on HDFS-3067: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1888 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1888/]) HDFS-3067. NPE in DFSInputStream.readBuffer if read is repeated on corrupted block. Contributed by Henry Robinson. (Revision 1301182) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301182 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java NPE in DFSInputStream.readBuffer if read is repeated on corrupted block --- Key: HDFS-3067 URL: https://issues.apache.org/jira/browse/HDFS-3067 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.24.0 Reporter: Henry Robinson Assignee: Henry Robinson Fix For: 0.24.0 Attachments: HDFS-3067.1.patch, HDFS-3607.patch With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException. 
Here's the body of a test that reproduces the problem: {code} final short REPL_FACTOR = 1; final long FILE_LENGTH = 512L; cluster.waitActive(); FileSystem fs = cluster.getFileSystem(); Path path = new Path("/corrupted"); DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L); DFSTestUtil.waitReplication(fs, path, REPL_FACTOR); ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path); int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block); assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted); InetSocketAddress nnAddr = new InetSocketAddress("localhost", cluster.getNameNodePort()); DFSClient client = new DFSClient(nnAddr, conf); DFSInputStream dis = client.open(path.toString()); byte[] arr = new byte[(int)FILE_LENGTH]; boolean sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); } catch (ChecksumException ex) { sawException = true; } assertTrue(sawException); sawException = false; try { dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here } catch (ChecksumException ex) { sawException = true; } {code} The stack: {code} java.lang.NullPointerException at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492) at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545) [snip test stack] {code} and the problem is that currentNode is null. It's left at null after the first read, which fails, and then is never refreshed because the condition in read that protects blockSeekTo is only triggered if the current position is outside the block's range. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3102) Add CLI tool to initialize the shared-edits dir
Add CLI tool to initialize the shared-edits dir --- Key: HDFS-3102 URL: https://issues.apache.org/jira/browse/HDFS-3102 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Todd Lipcon Currently in order to make a non-HA NN HA, you need to initialize the shared edits dir. This can be done manually by cp'ing directories around. It would be preferable to add a namenode -initializeSharedEdits command to achieve this same effect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3103) HA NN should be able to handle some cases of storage dir recovery on start
HA NN should be able to handle some cases of storage dir recovery on start -- Key: HDFS-3103 URL: https://issues.apache.org/jira/browse/HDFS-3103 Project: Hadoop HDFS Issue Type: Improvement Components: ha, name-node Affects Versions: 0.24.0, 0.23.3 Reporter: Todd Lipcon As a shortcut in developing HA, we elected not to deal with the case of storage directory recovery while HA is enabled. But there are many cases where we can and should handle it. For example, if the user configures two local dirs and one shared dir, but one of the local dirs is empty at startup, we should be able to re-format the empty dir from the other local dir.
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230581#comment-13230581 ] Hadoop QA commented on HDFS-3099: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518530/HDFS-3099.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2012//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2012//console This message is automatically generated. SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Attachments: HDFS-3099.patch, HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. 
[jira] [Created] (HDFS-3104) Add tests for mkdir -p
Add tests for mkdir -p -- Key: HDFS-3104 URL: https://issues.apache.org/jira/browse/HDFS-3104 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Add tests for HADOOP-8175.
[jira] [Updated] (HDFS-3104) Add tests for mkdir -p
[ https://issues.apache.org/jira/browse/HDFS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3104: -- Target Version/s: 0.24.0, 0.23.2 (was: 0.23.2, 0.24.0) Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-3104) Add tests for mkdir -p
[ https://issues.apache.org/jira/browse/HDFS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3104: -- Attachment: HDFS-3104.patch
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230592#comment-13230592 ] Aaron T. Myers commented on HDFS-3099: -- The test failures are unrelated to this patch. TestHDFSCLI is currently failing on trunk, and the TestValidateConfigurationSettings failure seems spurious; it passed just fine on my box. I'm going to commit this momentarily.
[jira] [Commented] (HDFS-3104) Add tests for mkdir -p
[ https://issues.apache.org/jira/browse/HDFS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230606#comment-13230606 ] Robert Joseph Evans commented on HDFS-3104: --- I reviewed the tests here and the corresponding source code change in HADOOP-8175. They both look good to me. +1 (non-binding).
[jira] [Updated] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3099: - Resolution: Fixed Fix Version/s: 0.23.3 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've just committed this to trunk and branch-0.23.
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230614#comment-13230614 ] Hadoop QA commented on HDFS-3101: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518520/h3101_20120315.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2019//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2019//console This message is automatically generated. cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch STEPS: 1. create a new EMPTY file 2. read it using webhdfs. RESULT: expected: get an empty file; got: {RemoteException:{exception:IOException,javaClassName:java.io.IOException,message:Offset=0 out of the range [0, 0); OPEN, path=/testFile}} First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK.
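A hedged sketch of the boundary condition at issue: for a file of length len, valid read offsets are [0, len), except that offset 0 into an empty file should be accepted and simply return no bytes. This is a standalone illustration, not the actual webhdfs code.

```java
import java.io.IOException;

// Sketch of an offset check that admits empty-file reads (assumption:
// simplified standalone code, not the actual webhdfs implementation).
class OffsetCheckSketch {
    static void checkOffset(long offset, long len) throws IOException {
        // offset 0 on an empty file is a legal zero-byte read
        if (len == 0 && offset == 0) return;
        if (offset < 0 || offset >= len) {
            throw new IOException("Offset=" + offset
                + " out of the range [0, " + len + ")");
        }
    }
}
```

With this special case, checkOffset(0, 0) succeeds while checkOffset(5, 5) still fails, matching the behavior the reporter expects.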
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230617#comment-13230617 ] Hadoop QA commented on HDFS-3099: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518530/HDFS-3099.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hdfs.TestDFSShell org.apache.hadoop.hdfs.TestDFSClientRetries org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2018//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2018//console This message is automatically generated.
[jira] [Updated] (HDFS-3004) Implement Recovery Mode
[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3004: --- Attachment: HDFS-3004.013.patch * remove SkippableEditLogException, as it turned out not to be necessary * test skipping in EditLogInputStream Implement Recovery Mode --- Key: HDFS-3004 URL: https://issues.apache.org/jira/browse/HDFS-3004 Project: Hadoop HDFS Issue Type: New Feature Components: tools Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004__namenode_recovery_tool.txt When the NameNode metadata is corrupt for some reason, we want to be able to fix it. Obviously, we would prefer never to get in this case. In a perfect world, we never would. However, bad data on disk can happen from time to time, because of hardware errors or misconfigurations. In the past we have had to correct it manually, which is time-consuming and which can result in downtime. Recovery mode is initialized by the system administrator. When the NameNode starts up in Recovery Mode, it will try to load the FSImage file, apply all the edits from the edits log, and then write out a new image. Then it will shut down. Unlike in the normal startup process, the recovery mode startup process will be interactive. When the NameNode finds something that is inconsistent, it will prompt the operator as to what it should do. The operator can also choose to take the first option for all prompts by starting up with the '-f' flag, or typing 'a' at one of the prompts. I have reused as much code as possible from the NameNode in this tool. Hopefully, the effort that was spent developing this will also make the NameNode editLog and image processing even more robust than it already is. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230630#comment-13230630 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Common-trunk-Commit #1880 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1880/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301222) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301222 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230629#comment-13230629 ] Suresh Srinivas commented on HDFS-3077: --- bq. but like Einstein said, no simpler! It's all relative :-) BTW, it would be good to write up the design for this; that avoids lengthy comments and keeps the summary of what is proposed in one place instead of scattered across multiple comments. bq. This is mostly great – so long as you have an external fencing strategy which prevents the old active from attempting to continue to write after the new active is trying to read. External fencing is not needed, given that active daemons have the ability to fence. bq. it gets the loggers to promise not to accept edits from the old active The daemons can stop accepting writes when they realize that the active lock is no longer held by the writer. Clearly an advantage of an active daemon compared to using passive storage. bq. But, we still have one more problem: given some txid N, we might have multiple actives that have tried to write the same transaction ID. Example scenario: The case of writes making it through only some daemons can also be solved: the write that has made it through W daemons wins. The others are marked not in sync and need to sync up. Explanation to follow. The solution we are building is specific to namenode editlogs. There is only one active writer (as Ivan brought up earlier). Here is the outline I am thinking of. Let's start with steady state with K of N journal daemons. When a journal daemon fails, we roll the edits. When a journal daemon joins, we roll the edits. The new journal daemon can start syncing the other finalized edits, while keeping track of edits in progress. We also keep track of the list of active daemons in ZooKeeper. Rolling gives a logical point for a newly joined daemon to sync up (sort of like a generation stamp).
During failover, the new active gets, from the actively written journals, the point up to which it has to sync, and rolls the edits at that point. Rolling also gives you a way to discard, during failover, extra journal records that did not make it to W daemons. When there are overlapping records, say e1-105 and e100-200, you read 100-105 from the second editlog and discard them from the first. Again, there are scenarios that are missing here. I plan to post more details in a design doc. Quorum-based protocol for reading and writing edit logs --- Key: HDFS-3077 URL: https://issues.apache.org/jira/browse/HDFS-3077 Project: Hadoop HDFS Issue Type: New Feature Components: ha, name-node Reporter: Todd Lipcon Assignee: Todd Lipcon Currently, one of the weak points of the HA design is that it relies on shared storage such as an NFS filer for the shared edit log. One alternative that has been proposed is to depend on BookKeeper, a ZooKeeper subproject which provides a highly available replicated edit log on commodity hardware. This JIRA is to implement another alternative, based on a quorum commit protocol, integrated more tightly in HDFS and with the requirements driven only by HDFS's needs rather than more generic use cases. More details to follow.
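The W-of-N rule discussed above can be stated in a few lines. This is a hedged toy, not the eventual implementation: a batch of edits counts as committed once a write quorum of the journal daemons has acknowledged it.

```java
import java.util.List;

// Toy quorum-commit check (assumption: simplified standalone sketch,
// not HDFS code). With N daemons, the write quorum is W = floor(N/2) + 1.
class QuorumSketch {
    static boolean isCommitted(List<Boolean> acks) {
        long acked = acks.stream().filter(a -> a).count();
        return acked >= acks.size() / 2 + 1; // acked >= W
    }
}
```

With N = 3 this gives W = 2: a write acknowledged by two daemons wins, while one acknowledged by a single daemon is the "not in sync" case that must be discarded or synced up on failover.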
[jira] [Commented] (HDFS-3077) Quorum-based protocol for reading and writing edit logs
[ https://issues.apache.org/jira/browse/HDFS-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230634#comment-13230634 ] Todd Lipcon commented on HDFS-3077: --- bq. The daemons can stop accepting writes when it realizes that active lock is no longer held by the writer. Clearly an advantage of an active daemon compared to using passive storage. Relying on ZK here is insufficient - the actual protocol itself needs fencing to guarantee that a quorum of loggers have seen the lost lock before the new writer starts writing. I agree with your later comments that rolling the edits is a helpful construct here, but you need to also make sure there's consensus on the active writer when beginning a new log segment. I'm about halfway done with a prototype implementation of this; I should have something to show by the middle of next week. At that point I'll also post a more thorough explanation of the design.
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230641#comment-13230641 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Common-0.23-Commit #685 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/685/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301230) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301230 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230646#comment-13230646 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Hdfs-0.23-Commit #676 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/676/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301230) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301230 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java
[jira] [Created] (HDFS-3105) Add DatanodeStorage information to block recovery
Add DatanodeStorage information to block recovery - Key: HDFS-3105 URL: https://issues.apache.org/jira/browse/HDFS-3105 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230666#comment-13230666 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1889 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1889/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301222) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301222 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java
[jira] [Updated] (HDFS-3105) Add DatanodeStorage information to block recovery
[ https://issues.apache.org/jira/browse/HDFS-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3105: - Component/s: hdfs client data-node Description: When recovering a block, the namenode and client do not have the datanode storage information of the block, so the namenode cannot add the block to the corresponding datanode storage block list.
[jira] [Commented] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230672#comment-13230672 ] Hadoop QA commented on HDFS-3062: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12517984/HDFS-3062-trunk-2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2020//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2020//console This message is automatically generated. Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch When testing the combination of NN HA + security + yarn, I found that the mapred job submission cannot pick up the logic URI of a nameservice. 
I have logic URI configured in core-site.xml {code} property namefs.defaultFS/name valuehdfs://ns1/value /property {code} HDFS client can work with the HA deployment/configs: {code} [root@nn1 hadoop]# hdfs dfs -ls / Found 6 items drwxr-xr-x - hbase hadoop 0 2012-03-07 20:42 /hbase drwxrwxrwx - yarn hadoop 0 2012-03-07 20:42 /logs drwxr-xr-x - mapred hadoop 0 2012-03-07 20:42 /mapred drwxr-xr-x - mapred hadoop 0 2012-03-07 20:42 /mr-history drwxrwxrwt - hdfs hadoop 0 2012-03-07 21:57 /tmp drwxr-xr-x - hdfs hadoop 0 2012-03-07 20:42 /user {code} but cannot submit a mapred job with security turned on {code} [root@nn1 hadoop]# /usr/lib/hadoop/bin/yarn --config ./conf jar share/hadoop/mapreduce/hadoop-mapreduce-examples-0.24.0-SNAPSHOT.jar randomwriter out Running 0 maps. Job started: Wed Mar 07 23:28:23 UTC 2012 java.lang.IllegalArgumentException: java.net.UnknownHostException: ns1 at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:431) at org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:312) at org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:217) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:119) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:97) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:411) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:326) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1221) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218) {code}0.24 -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
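The UnknownHostException above comes from the client treating the logical nameservice name ("ns1") as a DNS hostname while building the delegation-token service name. As a hedged sketch (hypothetical class and method names, not the actual SecurityUtil/DistributedFileSystem code), the fix amounts to recognizing configured logical URIs and skipping host resolution for them:

```java
import java.net.URI;
import java.util.Set;

// Hypothetical sketch, not the actual Hadoop implementation: derive a
// delegation-token service name without resolving logical HA URIs via DNS.
public class TokenServiceSketch {
    /** 'logicalNameservices' stands in for the configured dfs.nameservices IDs. */
    public static String buildTokenService(URI uri, Set<String> logicalNameservices) {
        String host = uri.getHost();
        if (logicalNameservices.contains(host)) {
            // Logical URI such as hdfs://ns1: use the nameservice ID as-is,
            // no DNS lookup (which is what threw UnknownHostException above).
            return host;
        }
        // Physical URI: keep the usual host:port form (8020 is the customary
        // NameNode RPC port, assumed here as a default for illustration).
        int port = uri.getPort() < 0 ? 8020 : uri.getPort();
        return host + ":" + port;
    }
}
```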
[jira] [Commented] (HDFS-3099) SecondaryNameNode does not properly initialize metrics system
[ https://issues.apache.org/jira/browse/HDFS-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230676#comment-13230676 ] Hudson commented on HDFS-3099: -- Integrated in Hadoop-Mapreduce-0.23-Commit #693 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/693/]) HDFS-3099. SecondaryNameNode does not properly initialize metrics system. Contributed by Aaron T. Myers. (Revision 1301230) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1301230 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestSecondaryWebUi.java SecondaryNameNode does not properly initialize metrics system - Key: HDFS-3099 URL: https://issues.apache.org/jira/browse/HDFS-3099 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.2 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.23.3 Attachments: HDFS-3099.patch, HDFS-3099.patch, HDFS-3099.patch The SecondaryNameNode is not properly initializing its metrics system. This results in the UgiMetrics, Metrics subsystem stats, and JvmMetrics not being output. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3106) TestHDFSCLI fails on Test ls: Test for /*/* globbing
TestHDFSCLI fails on Test ls: Test for /*/* globbing --- Key: HDFS-3106 URL: https://issues.apache.org/jira/browse/HDFS-3106 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.23.2 Reporter: Ravi Prakash This is the one and only test failure: 2012-03-15 18:06:42,068 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(156)) - --- 2012-03-15 18:06:42,068 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(157)) - Test ID: [30] 2012-03-15 18:06:42,068 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(158)) -Test Description: [ls: Test for /*/* globbing ] 2012-03-15 18:06:42,068 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(159)) - 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs hdfs://localhost.localdomain:32992 -mkdir /dir0] 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs hdfs://localhost.localdomain:32992 -mkdir /dir0/dir1] 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs hdfs://localhost.localdomain:32992 -touchz /dir0/dir1/file1] 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs hdfs://localhost.localdomain:32992 -ls -R /\*/\*] 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(167)) - 2012-03-15 18:06:42,069 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(170)) -Cleanup Commands: [-fs hdfs://localhost.localdomain:32992 -rm -r /dir0] 2012-03-15 18:06:42,070 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(174)) - 2012-03-15 18:06:42,070 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(178)) - Comparator: [RegexpComparator] 2012-03-15 18:06:42,070 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(180)) - Comparision result: [fail] 2012-03-15 18:06:42,070 INFO cli.CLITestHelper 
(CLITestHelper.java:displayResults(182)) - Expected output: [^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/dir0/dir1/file1] 2012-03-15 18:06:42,070 INFO cli.CLITestHelper (CLITestHelper.java:displayResults(184)) - Actual output: [ls: `/\*/\*': No such file or directory]
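The actual output above shows the shell treating the backslash-escaped glob literally. As an illustration (using java.nio globs, not FsShell itself), once backslash escapes are honored, "/\*/\*" matches only a path literally named "/*/*", while the unescaped "/*/*" matches /dir0/dir1:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Demonstrates the escaped-vs-unescaped glob distinction behind the failure.
public class GlobEscapeDemo {
    public static boolean matches(String glob, String path) {
        // java.nio glob syntax also uses backslash to escape metacharacters.
        PathMatcher m = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return m.matches(Paths.get(path));
    }
}
```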
[jira] [Updated] (HDFS-3094) add -nonInteractive and -force option to namenode -format command
[ https://issues.apache.org/jira/browse/HDFS-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Gupta updated HDFS-3094: -- Attachment: HDFS-3094.patch added more error checking for invalid clusterid options and added tests for them add -nonInteractive and -force option to namenode -format command - Key: HDFS-3094 URL: https://issues.apache.org/jira/browse/HDFS-3094 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.24.0, 1.0.2 Reporter: Arpit Gupta Assignee: Arpit Gupta Attachments: HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.branch-1.0.patch, HDFS-3094.docs.patch, HDFS-3094.patch, HDFS-3094.patch Currently the bin/hadoop namenode -format prompts the user for a Y/N to setup the directories in the local file system. -force : namenode formats the directories without prompting -nonInterActive : namenode format will return with an exit code of 1 if the dir exists. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
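The -force / -nonInteractive semantics described above can be sketched as a small decision function (hypothetical names, not the actual NameNode.java code): format silently with -force, exit with code 1 under -nonInteractive when the directories already exist, otherwise prompt.

```java
// Hypothetical sketch of the format command's prompt/exit-code policy.
public class FormatPolicySketch {
    /** Returns the action taken given the flags and whether name dirs exist. */
    public static String decide(boolean force, boolean nonInteractive, boolean dirsExist) {
        if (!dirsExist || force) {
            return "FORMAT";   // nothing to protect, or -force given: no prompt
        }
        if (nonInteractive) {
            return "EXIT_1";   // -nonInteractive: fail instead of prompting
        }
        return "PROMPT";       // default behavior: ask the user Y/N
    }
}
```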
[jira] [Commented] (HDFS-3104) Add tests for mkdir -p
[ https://issues.apache.org/jira/browse/HDFS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230703#comment-13230703 ] Hadoop QA commented on HDFS-3104: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518549/HDFS-3104.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.cli.TestHDFSCLI +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2021//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2021//console This message is automatically generated. Add tests for mkdir -p -- Key: HDFS-3104 URL: https://issues.apache.org/jira/browse/HDFS-3104 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3104.patch Add tests for HADOOP-8175. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230715#comment-13230715 ] Mingjie Lai commented on HDFS-3062: --- The test error is reported at HDFS-3106. It's not caused by the patch here. Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch
[jira] [Commented] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230727#comment-13230727 ] Todd Lipcon commented on HDFS-3062: --- +1, will commit momentarily. Thanks for fixing this, Mingjie. Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch
[jira] [Updated] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3062: -- Resolution: Fixed Fix Version/s: 0.23.3 0.24.0 Target Version/s: 0.24.0, 0.23.3 (was: 0.23.3, 0.24.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to branch-23 and trunk, thanks Mingjie Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch
[jira] [Commented] (HDFS-3098) Update FsShell tests for quoted metachars
[ https://issues.apache.org/jira/browse/HDFS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230733#comment-13230733 ] Hadoop QA commented on HDFS-3098: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12518522/HDFS-3098.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 8 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2022//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2022//console This message is automatically generated. Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3098.patch Need to add tests to TestDFSShell for quoted metachars. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3101: - Attachment: h3101_20120315_branch-1.patch h3101_20120315_branch-1.patch: for branch-1. cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch STEP: 1, create a new EMPTY file 2, read it using webhdfs. RESULT: expected: get an empty file I got: {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Offset=0 out of the range [0, 0); OPEN, path=/testFile"}} First of all, [0, 0) is not a valid range, and I think reading an empty file should be OK.
[jira] [Updated] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3101: - Resolution: Fixed Fix Version/s: 0.23.3 1.0.2 0.23.2 1.1.0 0.24.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Daryn for the review. I have committed this. cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 1.1.0, 0.23.2, 1.0.2, 0.23.3 Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch
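The empty-file failure reported in HDFS-3101 is an off-by-one in an offset range check: with an exclusive upper bound, offset 0 on a zero-length file is rejected even though reading nothing is valid. A minimal sketch of the check involved (hypothetical method names, not the actual NamenodeWebHdfsMethods patch):

```java
// Sketch of the range-check bug behind "Offset=0 out of the range [0, 0)".
public class OffsetCheckSketch {
    // Buggy form: strict offset < length rejects offset 0 on an empty file.
    public static boolean validBuggy(long offset, long length) {
        return offset >= 0 && offset < length;
    }

    // Fixed form: additionally permit offset 0 when the file is empty.
    public static boolean validFixed(long offset, long length) {
        return offset >= 0 && (offset < length || (offset == 0 && length == 0));
    }
}
```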
[jira] [Commented] (HDFS-3062) Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission.
[ https://issues.apache.org/jira/browse/HDFS-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230749#comment-13230749 ] Hudson commented on HDFS-3062: -- Integrated in Hadoop-Common-0.23-Commit #687 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/687/]) HDFS-3062. Fix bug which prevented MR job submission from creating delegation tokens on an HA cluster. Contributed by Mingjie Lai. (Revision 1301286) Result = SUCCESS todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301286 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DistributedFileSystem.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java Fail to submit mapred job on a secured-HA-HDFS: logic URI cannot be picked up by job submission. Key: HDFS-3062 URL: https://issues.apache.org/jira/browse/HDFS-3062 Project: Hadoop HDFS Issue Type: Bug Components: ha, security Affects Versions: 0.24.0 Reporter: Mingjie Lai Assignee: Mingjie Lai Priority: Critical Fix For: 0.24.0, 0.23.3 Attachments: HDFS-3062-trunk-2.patch, HDFS-3062-trunk.patch
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230750#comment-13230750 ] Hudson commented on HDFS-3101: -- Integrated in Hadoop-Common-0.23-Commit #687 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/687/]) svn merge -c 1301287 from trunk for HDFS-3101. (Revision 1301288) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301288 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 1.1.0, 0.23.2, 1.0.2, 0.23.3 Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch
[jira] [Commented] (HDFS-3098) Update FsShell tests for quoted metachars
[ https://issues.apache.org/jira/browse/HDFS-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230751#comment-13230751 ] Tsz Wo (Nicholas), SZE commented on HDFS-3098: -- +1 That's great. The build is back to stable. Update FsShell tests for quoted metachars - Key: HDFS-3098 URL: https://issues.apache.org/jira/browse/HDFS-3098 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.24.0, 0.23.2 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-3098.patch
[jira] [Commented] (HDFS-3101) cannot read empty file using webhdfs
[ https://issues.apache.org/jira/browse/HDFS-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230758#comment-13230758 ] Hudson commented on HDFS-3101: -- Integrated in Hadoop-Common-trunk-Commit #1883 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1883/]) HDFS-3101. Cannot read empty file using WebHDFS. (Revision 1301287) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1301287 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java cannot read empty file using webhdfs Key: HDFS-3101 URL: https://issues.apache.org/jira/browse/HDFS-3101 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.1 Reporter: Zhanwei.Wang Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.24.0, 1.1.0, 0.23.2, 1.0.2, 0.23.3 Attachments: h3101_20120315.patch, h3101_20120315_branch-1.patch