[jira] [Commented] (HDFS-1957) Documentation for HFTP
[ https://issues.apache.org/jira/browse/HDFS-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036000#comment-13036000 ]

Hadoop QA commented on HDFS-1957:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479719/HDFS-1957.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+0 tests included. The patch appears to be a documentation patch that doesn't require tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/580//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/580//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/580//console

This message is automatically generated.

Documentation for HFTP
----------------------

Key: HDFS-1957
URL: https://issues.apache.org/jira/browse/HDFS-1957
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 0.23.0
Reporter: Ari Rabkin
Assignee: Ari Rabkin
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1957.patch, HDFS-1957.patch, HDFS-1957.patch

There should be some documentation for HFTP.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
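Since HFTP is exposed through the standard FileSystem API, documentation examples for it largely reduce to ordinary FileSystem calls with an hftp:// URI. A minimal sketch of reading a file this way — the host name and file path below are hypothetical, and 50070 is only the customary NameNode HTTP port:

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HftpCat {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // HFTP is read-only; writes still have to go through hdfs://.
    FileSystem fs =
        FileSystem.get(URI.create("hftp://namenode.example.com:50070/"), conf);
    FSDataInputStream stream = fs.open(new Path("/user/example/file.txt"));
    BufferedReader in = new BufferedReader(new InputStreamReader(stream));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(line);
    }
    in.close();
  }
}
{code}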
[jira] [Updated] (HDFS-1961) New architectural documentation created
[ https://issues.apache.org/jira/browse/HDFS-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kazman updated HDFS-1961:
------------------------------

Attachment: HDFS ArchDoc.Jira.docx

This is a Word version of the architecture documentation. The HTML version can be found at:
http://kazman.shidler.hawaii.edu/ArchDoc.html

New architectural documentation created
----------------------------------------

Key: HDFS-1961
URL: https://issues.apache.org/jira/browse/HDFS-1961
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 0.21.0
Reporter: Rick Kazman
Labels: architecture, hadoop, newbie
Fix For: 0.21.0
Attachments: HDFS ArchDoc.Jira.docx

This material provides an overview of the HDFS architecture and is intended for contributors. The goal of this document is to provide a guide to the overall structure of the HDFS code so that contributors can more effectively understand how changes that they are considering can be made, and the consequences of those changes. The assumption is that the reader has a basic understanding of HDFS, its purpose, and how it fits into the Hadoop project suite.

An HTML version of the architectural documentation can be found at:
http://kazman.shidler.hawaii.edu/ArchDoc.html

All comments and suggestions for improvements are appreciated.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1961) New architectural documentation created
[ https://issues.apache.org/jira/browse/HDFS-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036003#comment-13036003 ]

Rick Kazman commented on HDFS-1961:
-----------------------------------

We expect to be making periodic updates to this document. Our first update task is to add sequence diagrams to section 6.

New architectural documentation created
----------------------------------------

Key: HDFS-1961
URL: https://issues.apache.org/jira/browse/HDFS-1961
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 0.21.0
Reporter: Rick Kazman
Labels: architecture, hadoop, newbie
Fix For: 0.21.0
Attachments: HDFS ArchDoc.Jira.docx

This material provides an overview of the HDFS architecture and is intended for contributors. The goal of this document is to provide a guide to the overall structure of the HDFS code so that contributors can more effectively understand how changes that they are considering can be made, and the consequences of those changes. The assumption is that the reader has a basic understanding of HDFS, its purpose, and how it fits into the Hadoop project suite.

An HTML version of the architectural documentation can be found at:
http://kazman.shidler.hawaii.edu/ArchDoc.html

All comments and suggestions for improvements are appreciated.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1575:
------------------------------

Attachment: hdfs-1575-trunk.3.patch

Hi Aaron. I noticed some opportunity for cleanup here:
- removed unused import of org.mortbay.Log
- cleaned up the JSP so there isn't duplicated code
- the case that blks == null or blks.size() == 0 and security was off was being handled incorrectly
- fixed a line that was super long

Can you take a look at my changes?

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
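A sketch of the null/empty handling the cleanup above describes, under the assumption that the JSP helper renders one row per located block; the class and method names are illustrative, not the committed code:

{code}
import java.io.PrintWriter;
import java.util.List;

import org.apache.hadoop.hdfs.protocol.LocatedBlock;

public class BlockListRenderer {
  // Treat a null or empty block list as "no blocks" instead of iterating blindly.
  static void renderBlocks(List<LocatedBlock> blks, String filename, PrintWriter out) {
    if (blks == null || blks.isEmpty()) {
      out.println("No datanodes contain blocks of file " + filename);
      return;
    }
    for (LocatedBlock blk : blks) {
      out.println("Block: " + blk.getBlock());  // one row per block
    }
  }
}
{code}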
[jira] [Updated] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-1905:
----------------------------------

Hadoop Flags: [Reviewed]
Status: Patch Available (was: Open)

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
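For illustration, a hedged sketch of the kind of usability fix being reviewed here — print a usage line before failing when the -format arguments are incomplete. The usage string and method shape are assumptions, not the contents of HDFS-1905-2.patch:

{code}
public class FormatUsage {
  // Illustrative usage text; the real command accepts more options.
  private static final String USAGE =
      "Usage: hdfs namenode -format [-clusterid <cid>]";

  // Returns the cluster ID following -clusterid, or fails with a usage hint.
  static String parseClusterId(String[] args, int i) {
    if (i + 1 >= args.length || args[i + 1].startsWith("-")) {
      System.err.println(USAGE);  // show how to call the command before failing
      throw new IllegalArgumentException(
          "Must specify a valid cluster ID after -clusterid");
    }
    return args[i + 1];
  }

  public static void main(String[] args) {
    System.out.println(parseClusterId(new String[] {"-clusterid", "c1"}, 0));
  }
}
{code}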
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036014#comment-13036014 ]

Aaron T. Myers commented on HDFS-1575:
--------------------------------------

+1, looks good to me. I especially like the reworking of the loop which iterates over {{blks}}. Thanks for cleaning this up.

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
[ https://issues.apache.org/jira/browse/HDFS-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036016#comment-13036016 ]

Todd Lipcon commented on HDFS-1922:
-----------------------------------

It seems like it would be straight-forward to have a missing .properties file act like the default one that we check into conf/ (ie FileSink). That would make it less of an incompatible change, right?

Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
----------------------------------------------------------------------

Key: HDFS-1922
URL: https://issues.apache.org/jira/browse/HDFS-1922
Project: Hadoop HDFS
Issue Type: Sub-task
Components: test
Reporter: Matt Foley
Assignee: Luke Lu
Fix For: 0.23.0
Attachments: hdfs-1922-conf-v1.patch

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
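A sketch of the fallback Todd suggests, assuming the metrics system loads its configuration from a classpath resource. The property key mirrors the example file shipped in conf/, but treat the details as assumptions rather than the eventual HADOOP-7306 change:

{code}
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class MetricsConfigLoader {
  // Load the named classpath resource, or fall back to defaults that mimic
  // the example file shipped in conf/ (a FileSink), instead of failing.
  static Properties loadOrDefault(String resource) {
    Properties props = new Properties();
    InputStream in = MetricsConfigLoader.class.getClassLoader()
        .getResourceAsStream(resource);
    if (in == null) {
      // Missing config behaves like the default conf/ file.
      props.setProperty("*.sink.file.class",
          "org.apache.hadoop.metrics2.sink.FileSink");
      return props;
    }
    try {
      props.load(in);
      in.close();
    } catch (IOException e) {
      throw new RuntimeException("Failed to read " + resource, e);
    }
    return props;
  }
}
{code}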
[jira] [Updated] (HDFS-1013) Miscellaneous improvements to HTML markup for web UIs
[ https://issues.apache.org/jira/browse/HDFS-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1013:
------------------------------

Status: Open (was: Patch Available)

Miscellaneous improvements to HTML markup for web UIs
-----------------------------------------------------

Key: HDFS-1013
URL: https://issues.apache.org/jira/browse/HDFS-1013
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Todd Lipcon
Assignee: Eugene Koontz
Priority: Minor
Labels: newbie
Fix For: 0.20.3
Attachments: HDFS-1013.patch

The Web UIs have various bits of bad markup (eg missing head sections, some pages missing CSS links, inconsistent td vs th for table headings). We should fix this up.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated
[ https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036024#comment-13036024 ]

Hadoop QA commented on HDFS-1592:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479721/HDFS-1592-3.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSRemove
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/581//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/581//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/581//console

This message is automatically generated.

Datanode startup doesn't honor volumes.tolerated
------------------------------------------------

Key: HDFS-1592
URL: https://issues.apache.org/jira/browse/HDFS-1592
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.20.204.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Fix For: 0.20.204.0, 0.23.0
Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, HDFS-1592-3.patch, HDFS-1592-rel20.patch

Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
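As an illustration of what "honoring volumes tolerated" at startup means, a hedged sketch: the DataNode should come up as long as the number of unusable data directories does not exceed the configured tolerance. This mirrors the described intent, not the patch itself, and the class is hypothetical:

{code}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class VolumeCheck {
  // Returns the usable data directories, failing startup only when more
  // volumes have failed than dfs.datanode.failed.volumes.tolerated allows.
  static List<File> usableVolumes(File[] dataDirs, int tolerated) {
    List<File> good = new ArrayList<File>();
    int failed = 0;
    for (File dir : dataDirs) {
      if (dir.isDirectory() && dir.canRead() && dir.canWrite()) {
        good.add(dir);
      } else {
        failed++;
      }
    }
    if (failed > tolerated) {
      throw new IllegalStateException(
          failed + " volumes failed, exceeds tolerated " + tolerated);
    }
    return good;
  }
}
{code}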
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036026#comment-13036026 ]

Hadoop QA commented on HDFS-1575:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479722/hdfs-1575-trunk.2.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/582//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/582//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/582//console

This message is automatically generated.

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036033#comment-13036033 ]

Eli Collins commented on HDFS-1958:
-----------------------------------

+1 lgtm

Format confirmation prompt should be more lenient of its input
---------------------------------------------------------------

Key: HDFS-1958
URL: https://issues.apache.org/jira/browse/HDFS-1958
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.22.0
Attachments: hdfs-1958.txt

As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
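The leniency under review is small enough to sketch directly — accept 'y' or 'yes' in any case, exactly as the issue description asks; the helper name is hypothetical rather than taken from hdfs-1958.txt:

{code}
public class ConfirmPrompt {
  // True iff the input confirms the prompt: "y" or "yes", case-insensitive.
  static boolean confirmed(String input) {
    if (input == null) {
      return false;
    }
    String s = input.trim().toLowerCase();
    return s.equals("y") || s.equals("yes");
  }

  public static void main(String[] args) {
    System.out.println(confirmed("YES"));  // true
    System.out.println(confirmed("n"));    // false
  }
}
{code}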
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036037#comment-13036037 ]

Hadoop QA commented on HDFS-1575:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479729/hdfs-1575-trunk.3.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 6 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.hdfs.TestHDFSTrash
  org.apache.hadoop.hdfs.TestPipelines
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/585//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/585//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/585//console

This message is automatically generated.

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1958:
------------------------------

Resolution: Fixed
Fix Version/s: (was: 0.22.0)
               0.23.0
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Thanks for review, Eli. I elected to only commit this to trunk since it's a new feature/improvement.

Format confirmation prompt should be more lenient of its input
---------------------------------------------------------------

Key: HDFS-1958
URL: https://issues.apache.org/jira/browse/HDFS-1958
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.23.0
Attachments: hdfs-1958.txt

As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036043#comment-13036043 ]

Hadoop QA commented on HDFS-1905:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479685/HDFS-1905-2.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/583//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/583//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/583//console

This message is automatically generated.

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036045#comment-13036045 ]

Todd Lipcon commented on HDFS-1575:
-----------------------------------

I tried the failing tests locally and they pass. Will commit to 22 and trunk momentarily.

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036049#comment-13036049 ]

Todd Lipcon commented on HDFS-1575:
-----------------------------------

Committed to trunk. Looks like we need to alter the patch a little for 0.22 since the federation stuff isn't there. Mind doing that?

viewing block from web UI broken
--------------------------------

Key: HDFS-1575
URL: https://issues.apache.org/jira/browse/HDFS-1575
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Aaron T. Myers
Priority: Blocker
Fix For: 0.22.0
Attachments: HDFS-1575, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch

DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL
{{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}}

java.io.FileNotFoundException: File does not exist: /
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834)
    ...
    at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258)
    at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1953) Change name node mxbean name in cluster web console
[ https://issues.apache.org/jira/browse/HDFS-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036048#comment-13036048 ]

Suresh Srinivas commented on HDFS-1953:
---------------------------------------

+1 for the patch. This is a simple change in the name of the mxbean. I am not planning to run hudson validation.

Change name node mxbean name in cluster web console
---------------------------------------------------

Key: HDFS-1953
URL: https://issues.apache.org/jira/browse/HDFS-1953
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Tanping Wang
Assignee: Tanping Wang
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1953-1.patch

The name node mxbean name changed after the new metrics framework was checked in. Need to change this in ClusterJspHelper.java in order for the cluster web console to work again.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
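For readers following along, a hedged sketch of looking up the NameNode MXBean by name over JMX. The ObjectName below follows the metrics2-era convention, but since this jira exists precisely because the name changed, treat it (and the "Version" attribute) as assumptions:

{code}
import java.lang.management.ManagementFactory;

import javax.management.MBeanServer;
import javax.management.ObjectName;

public class NameNodeMXBeanLookup {
  public static void main(String[] args) throws Exception {
    MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
    // Assumed metrics2-era name; the v1 name differed, which broke the console.
    ObjectName name = new ObjectName("Hadoop:service=NameNode,name=NameNodeInfo");
    Object version = mbs.getAttribute(name, "Version");
    System.out.println("NameNode version: " + version);
  }
}
{code}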
[jira] [Resolved] (HDFS-1953) Change name node mxbean name in cluster web console
[ https://issues.apache.org/jira/browse/HDFS-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas resolved HDFS-1953.
-----------------------------------

Resolution: Fixed
Hadoop Flags: [Reviewed]

I committed the patch. Thank you Tanping.

Change name node mxbean name in cluster web console
---------------------------------------------------

Key: HDFS-1953
URL: https://issues.apache.org/jira/browse/HDFS-1953
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Tanping Wang
Assignee: Tanping Wang
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1953-1.patch

The name node mxbean name changed after the new metrics framework was checked in. Need to change this in ClusterJspHelper.java in order for the cluster web console to work again.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036052#comment-13036052 ]

Hadoop QA commented on HDFS-1905:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12479685/HDFS-1905-2.patch
against trunk revision 1124459.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileConcurrentReader
  org.apache.hadoop.hdfs.TestGetBlocks
  org.apache.hadoop.hdfs.TestHDFSTrash
  org.apache.hadoop.tools.TestJMXGet
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.

Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/584//testReport/
Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/584//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/584//console

This message is automatically generated.

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1875) MiniDFSCluster hard-codes dfs.datanode.address to localhost
[ https://issues.apache.org/jira/browse/HDFS-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036253#comment-13036253 ]

Eric Payne commented on HDFS-1875:
----------------------------------

Test failures are not related to this patch. They were failing in several of the previous builds as well. See Build #556, for example.

MiniDFSCluster hard-codes dfs.datanode.address to localhost
-----------------------------------------------------------

Key: HDFS-1875
URL: https://issues.apache.org/jira/browse/HDFS-1875
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 0.22.0
Reporter: Eric Payne
Assignee: Eric Payne
Fix For: 0.23.0
Attachments: HDFS-1875.patch

When creating RPC addresses that represent the communication sockets for each simulated DataNode, the MiniDFSCluster class hard-codes the address of the dfs.datanode.address port to be 127.0.0.1:0

The DataNodeCluster test tool uses the MiniDFSCluster class to create a selected number of simulated datanodes on a single host. In the DataNodeCluster setup, the NameNode is not simulated but is started as a separate daemon.

The problem is that if the write requests into the simulated datanodes originate on a host that is not the same host running the simulated datanodes, the connections are refused. This is because the RPC sockets that are started by MiniDFSCluster are for localhost (127.0.0.1) and are not accessible from outside that same machine.

It is proposed that the MiniDFSCluster.setupDatanodeAddress() method be overloaded in order to accommodate an environment where the NameNode is on one host, the client is on another host, and the simulated DataNodes are on yet another host (or even multiple hosts simulating multiple DataNodes each). The overloaded API would add a parameter that would be used as the basis for creating the RPC sockets. By default, it would remain 127.0.0.1

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
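A sketch of the proposed overload, using the real dfs.datanode.* configuration keys but an assumed method shape; the no-argument form keeps the current 127.0.0.1 behavior, as the description requires:

{code}
import org.apache.hadoop.conf.Configuration;

public class DatanodeAddressSetup {
  // Default keeps the existing behavior: bind everything to localhost.
  static void setupDatanodeAddress(Configuration conf) {
    setupDatanodeAddress(conf, "127.0.0.1");
  }

  // Proposed overload: callers supply the bind address used for the sockets.
  static void setupDatanodeAddress(Configuration conf, String bindAddress) {
    // Port 0 asks the OS for any free port, as MiniDFSCluster already does.
    conf.set("dfs.datanode.address", bindAddress + ":0");
    conf.set("dfs.datanode.http.address", bindAddress + ":0");
    conf.set("dfs.datanode.ipc.address", bindAddress + ":0");
  }
}
{code}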
[jira] [Created] (HDFS-1962) Enhance MiniDFSCluster to improve testing of network topology distance related issues.
Enhance MiniDFSCluster to improve testing of network topology distance related issues.
---------------------------------------------------------------------------------------

Key: HDFS-1962
URL: https://issues.apache.org/jira/browse/HDFS-1962
Project: Hadoop HDFS
Issue Type: Improvement
Components: test
Affects Versions: 0.22.0
Reporter: Eric Payne
Fix For: 0.23.0

In Jira HDFS-1875, Tanping Wang added the following comment. In order to keep the scope of HDFS-1875 small, I have created this Jira to capture this need.

{quote}
It would be really useful if we could have multiple simulated data nodes bound to different hosts and the dfs client bound to a particular host. And further down the road, some of the simulated data nodes could be on different hosts but the same rack. We can use this to test network topology distance related issues.

One related problem that I ran into was that the order of data nodes in the LocatedBlock returned by the name node is sorted by NetworkTopology#pseudoSortByDistance(). In the current mini dfs cluster, there is no way I can bind the client to a host or bind a simulated data node to a particular host/rack. It would be nice if the mini dfs cluster made this possible, so that the network topology distance of the client to each data node is fixed and, therefore, the order of data nodes returned within a LocatedBlock on the mini dfs cluster is fixed. Currently the order of data nodes in a LocatedBlock is randomly sorted, which means NetworkTopology does not treat the DFSClient and the simulated datanodes as being on different hosts and different racks. Also, the mini dfs cluster currently provides the -racks option when starting data nodes, but we cannot bind multiple simulated data nodes to one rack... so it is not really that useful.
{quote}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
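For context, the rack side of this is already partly expressible through the long-standing MiniDFSCluster constructor that takes per-datanode rack strings; what this jira asks for beyond that (binding the client and datanodes to distinct hosts) is deliberately not shown, since it does not exist yet. A minimal, hedged sketch:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.MiniDFSCluster;

public class RackedMiniCluster {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Two datanodes share /rack1, one is on /rack2 -- but all of them still
    // bind to 127.0.0.1, which is exactly the limitation this jira targets.
    String[] racks = {"/rack1", "/rack1", "/rack2"};
    MiniDFSCluster cluster = new MiniDFSCluster(conf, 3, true, racks);
    try {
      cluster.waitActive();
    } finally {
      cluster.shutdown();
    }
  }
}
{code}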
[jira] [Commented] (HDFS-1875) MiniDFSCluster hard-codes dfs.datanode.address to localhost
[ https://issues.apache.org/jira/browse/HDFS-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036257#comment-13036257 ]

Eric Payne commented on HDFS-1875:
----------------------------------

In order to keep the scope of this Jira small, I have opened HDFS-1962 to cover Tanping's topology enhancement idea.

MiniDFSCluster hard-codes dfs.datanode.address to localhost
-----------------------------------------------------------

Key: HDFS-1875
URL: https://issues.apache.org/jira/browse/HDFS-1875
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 0.22.0
Reporter: Eric Payne
Assignee: Eric Payne
Fix For: 0.23.0
Attachments: HDFS-1875.patch

When creating RPC addresses that represent the communication sockets for each simulated DataNode, the MiniDFSCluster class hard-codes the address of the dfs.datanode.address port to be 127.0.0.1:0

The DataNodeCluster test tool uses the MiniDFSCluster class to create a selected number of simulated datanodes on a single host. In the DataNodeCluster setup, the NameNode is not simulated but is started as a separate daemon.

The problem is that if the write requests into the simulated datanodes originate on a host that is not the same host running the simulated datanodes, the connections are refused. This is because the RPC sockets that are started by MiniDFSCluster are for localhost (127.0.0.1) and are not accessible from outside that same machine.

It is proposed that the MiniDFSCluster.setupDatanodeAddress() method be overloaded in order to accommodate an environment where the NameNode is on one host, the client is on another host, and the simulated DataNodes are on yet another host (or even multiple hosts simulating multiple DataNodes each). The overloaded API would add a parameter that would be used as the basis for creating the RPC sockets. By default, it would remain 127.0.0.1

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1869) mkdirs should use the supplied permission for all of the created directories
[ https://issues.apache.org/jira/browse/HDFS-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036264#comment-13036264 ]

Sanjay Radia commented on HDFS-1869:
------------------------------------

Daryn, have you determined if the semantics of mkdirs changed at some point, or if this bug has always existed?

mkdirs should use the supplied permission for all of the created directories
-----------------------------------------------------------------------------

Key: HDFS-1869
URL: https://issues.apache.org/jira/browse/HDFS-1869
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Attachments: HDFS-1869-2.patch, HDFS-1869.patch

Mkdirs only uses the supplied FsPermission for the last directory of the path. Paths 0..N-1 will all inherit the parent dir's permissions -even if- inheritPermission is false. This is a regression from somewhere around 0.20.9 and does not follow posix semantics.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1568) Improve DataXceiver error logging
[ https://issues.apache.org/jira/browse/HDFS-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joey Echeverria updated HDFS-1568:
----------------------------------

Attachment: HDFS-1568-output-changes.patch

Here's a new patch that only includes the changes that affect output in the logs. The rest of the changes in the original patch do one of two things:
1) Re-format the code to be more consistent.
2) Replace calls to s.getRemoteSocketAddress() in log statements with references to remoteAddress, which is set in the constructor.

Improve DataXceiver error logging
---------------------------------

Key: HDFS-1568
URL: https://issues.apache.org/jira/browse/HDFS-1568
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Joey Echeverria
Priority: Minor
Labels: newbie
Attachments: HDFS-1568-1.patch, HDFS-1568-output-changes.patch

In supporting customers we often see things like SocketTimeoutExceptions or EOFExceptions coming from DataXceiver, but the logging isn't very good. For example, if we get an IOE while setting up a connection to the downstream mirror in writeBlock, the IP of the downstream mirror isn't logged on the DN side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
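A sketch of the refactor described in item 2: capture the remote address once in the constructor and reuse it at every log site, so log output stays consistent even after the socket is closed. The class shape and log text are illustrative, not the patch itself:

{code}
import java.net.Socket;

class XceiverLogHelper {
  private final String remoteAddress;  // captured once in the constructor

  XceiverLogHelper(Socket s) {
    // Cache the address instead of calling s.getRemoteSocketAddress()
    // at each log site.
    this.remoteAddress = String.valueOf(s.getRemoteSocketAddress());
  }

  void logOpFailure(String op, Exception e) {
    System.err.println(op + " received exception from " + remoteAddress
        + ": " + e);
  }
}
{code}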
[jira] [Commented] (HDFS-1869) mkdirs should use the supplied permission for all of the created directories
[ https://issues.apache.org/jira/browse/HDFS-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036274#comment-13036274 ]

Daryn Sharp commented on HDFS-1869:
-----------------------------------

Yes, mkdirs used to be posix compliant, but was subsequently broken. This is directly related to the linked HADOOP bug that mentioned the problem being introduced sometime after 0.20.9. The broken behavior was introduced when another feature was added (my memory is fuzzy, I think it was quotas).

mkdirs should use the supplied permission for all of the created directories
-----------------------------------------------------------------------------

Key: HDFS-1869
URL: https://issues.apache.org/jira/browse/HDFS-1869
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Attachments: HDFS-1869-2.patch, HDFS-1869.patch

Mkdirs only uses the supplied FsPermission for the last directory of the path. Paths 0..N-1 will all inherit the parent dir's permissions -even if- inheritPermission is false. This is a regression from somewhere around 0.20.9 and does not follow posix semantics.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
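A small illustration of the posix-style behavior the bug report expects: every directory created by mkdirs(path, perm), not just the last component, should carry the supplied permission (modulo umask). The paths below are hypothetical:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class MkdirsPermDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FsPermission perm = new FsPermission((short) 0700);
    fs.mkdirs(new Path("/tmp/a/b/c"), perm);
    // Expected after the fix: /tmp/a, /tmp/a/b and /tmp/a/b/c all report
    // rwx------ rather than inheriting the parent's permissions.
    for (String p : new String[] {"/tmp/a", "/tmp/a/b", "/tmp/a/b/c"}) {
      System.out.println(p + " -> "
          + fs.getFileStatus(new Path(p)).getPermission());
    }
  }
}
{code}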
[jira] [Created] (HDFS-1963) HDFS rpm integration project
HDFS rpm integration project
----------------------------

Key: HDFS-1963
URL: https://issues.apache.org/jira/browse/HDFS-1963
Project: Hadoop HDFS
Issue Type: New Feature
Components: build
Environment: Java 6, RHEL 5.5
Reporter: Eric Yang
Assignee: Eric Yang

This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against hdfs svn trunk.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated HDFS-1963:
----------------------------

Release Note: Create HDFS RPM package
Status: Patch Available (was: Open)

HDFS rpm integration project
----------------------------

Key: HDFS-1963
URL: https://issues.apache.org/jira/browse/HDFS-1963
Project: Hadoop HDFS
Issue Type: New Feature
Components: build
Environment: Java 6, RHEL 5.5
Reporter: Eric Yang
Assignee: Eric Yang
Attachments: HDFS-1963.patch

This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against hdfs svn trunk.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Yang updated HDFS-1963:
----------------------------

Attachment: HDFS-1963.patch

HDFS rpm integration project
----------------------------

Key: HDFS-1963
URL: https://issues.apache.org/jira/browse/HDFS-1963
Project: Hadoop HDFS
Issue Type: New Feature
Components: build
Environment: Java 6, RHEL 5.5
Reporter: Eric Yang
Assignee: Eric Yang
Attachments: HDFS-1963.patch

This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against hdfs svn trunk.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036311#comment-13036311 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1958:
----------------------------------------------

bq. well, fsck for example supports either 'y' or 'Y' for yes, and 'n' or 'N' for no.

fsck and format are different: it is okay to accidentally click y for fsck but not for format. I was asking if this is a good feature in [my previous comment|https://issues.apache.org/jira/browse/HDFS-1958?focusedCommentId=13035881&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13035881]. I tried to change it a few years back but I heard one argument saying that the command was deliberately designed to prevent accidental formats.

BTW, I think this should be classified as a newbie issue once we have decided to do it.

Format confirmation prompt should be more lenient of its input
---------------------------------------------------------------

Key: HDFS-1958
URL: https://issues.apache.org/jira/browse/HDFS-1958
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.23.0
Attachments: hdfs-1958.txt

As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tanping Wang updated HDFS-1371:
-------------------------------

Status: Patch Available (was: Open)

One bad node can incorrectly flag many files as corrupt
-------------------------------------------------------

Key: HDFS-1371
URL: https://issues.apache.org/jira/browse/HDFS-1371
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client, name-node
Affects Versions: 0.20.1, 0.23.0
Environment: yahoo internal version
[knoguchi@gwgd4003 ~]$ hadoop version
Hadoop 0.20.104.3.1007030707
Reporter: Koji Noguchi
Assignee: Tanping Wang
Fix For: 0.23.0
Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch

On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036320#comment-13036320 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1877:
----------------------------------------------

- The variables, {{inJunitMode}}, {{BLOCK_SIZE}}, {{dfs}}, are not actually used. Please remove them.
- How about the default {{filenameOption}} equals {{ROOT_DIR}}?
- You may simply have {{static private Log LOG = LogFactory.getLog(TestWriteRead.class);}}
{code}
+  static private Log LOG;
+
+  @Before
+  public void initJunitModeTest() throws Exception {
+    LOG = LogFactory.getLog(TestWriteRead.class);
{code}
- Please remove the following. The default is already INFO.
{code}
+    ((Log4JLogger) FSNamesystem.LOG).getLogger().setLevel(Level.INFO);
+    ((Log4JLogger) DFSClient.LOG).getLogger().setLevel(Level.INFO);
{code}
- Most public methods should be package private.
- Please add comments to tell how to use the command options and the default values.

Create a functional test for file read/write
---------------------------------------------

Key: HDFS-1877
URL: https://issues.apache.org/jira/browse/HDFS-1877
Project: Hadoop HDFS
Issue Type: Test
Components: test
Affects Versions: 0.22.0
Reporter: CW Chung
Priority: Minor
Attachments: TestWriteRead.java, TestWriteRead.patch

It would be great to have a tool, running on a real grid, to perform function tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
[ https://issues.apache.org/jira/browse/HDFS-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036326#comment-13036326 ]

Luke Lu commented on HDFS-1922:
-------------------------------

The only difference between the new behavior and metrics v1 is that in metrics v1, the metrics-related mbeans are started whether or not a metrics context is configured. In hindsight, I think I should've treated a missing config as a default/empty config for better compatibility and fewer surprises. I just opened HADOOP-7306 to revert the metrics system to the old behavior.

Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
----------------------------------------------------------------------

Key: HDFS-1922
URL: https://issues.apache.org/jira/browse/HDFS-1922
Project: Hadoop HDFS
Issue Type: Sub-task
Components: test
Reporter: Matt Foley
Assignee: Luke Lu
Fix For: 0.23.0
Attachments: hdfs-1922-conf-v1.patch

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1957) Documentation for HFTP
[ https://issues.apache.org/jira/browse/HDFS-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036327#comment-13036327 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1957:
----------------------------------------------

bq. ... Is the current text a bad way to say that?

The current text is good. I misread it earlier.

+1 patch looks good.

Documentation for HFTP
----------------------

Key: HDFS-1957
URL: https://issues.apache.org/jira/browse/HDFS-1957
Project: Hadoop HDFS
Issue Type: Improvement
Components: documentation
Affects Versions: 0.23.0
Reporter: Ari Rabkin
Assignee: Ari Rabkin
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1957.patch, HDFS-1957.patch, HDFS-1957.patch

There should be some documentation for HFTP.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036330#comment-13036330 ]

Suresh Srinivas commented on HDFS-1905:
---------------------------------------

+1 for the patch.

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1905) Improve the usability of namenode -format
[ https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-1905:
----------------------------------

Resolution: Fixed
Status: Resolved (was: Patch Available)

I committed the patch. Thank you Bharath.

Improve the usability of namenode -format
-----------------------------------------

Key: HDFS-1905
URL: https://issues.apache.org/jira/browse/HDFS-1905
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
Fix For: 0.23.0
Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch

While setting up a 0.23-based cluster, I ran into this issue. When I issue a format namenode command, which got changed in 23, it should let the user know how to use this command in cases where the complete options were not specified.

./hdfs namenode -format

I get the following error msg, but it's still not clear how the user should use this command.

11/05/09 15:36:25 ERROR namenode.NameNode: java.lang.IllegalArgumentException: Format must be provided with clusterid
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)

The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036339#comment-13036339 ]

Jakob Homan commented on HDFS-1958:
-----------------------------------

Less than 24 hours between this issue being opened and committed, on non-critical issues, seems a little short. Perhaps the community should be given more of a chance to weigh in before committing things, particularly when an experienced committer is raising questions about it?

Format confirmation prompt should be more lenient of its input
---------------------------------------------------------------

Key: HDFS-1958
URL: https://issues.apache.org/jira/browse/HDFS-1958
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Fix For: 0.23.0
Attachments: hdfs-1958.txt

As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036340#comment-13036340 ] Tsz Wo (Nicholas), SZE commented on HDFS-1057: -- I believe this test mostly fails on the build infrastructure ... It seems that the machines on the build infrastructure are slow/old/heavily loaded. Tests may fail more easily there than locally, so choosing the test parameters, e.g. how many concurrent writers, becomes non-trivial. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
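The race in the issue description comes down to publish-before-flush ordering. A minimal sketch of the unsafe ordering versus the safe one, using stand-in types rather than the real BlockReceiver/replica classes:
{code}
import java.io.IOException;
import java.io.OutputStream;

// Illustration only; ReplicaSketch is a stand-in for the replica metadata
// that concurrent readers consult, not the actual HDFS type.
class ReplicaSketch {
  private volatile long bytesOnDisk;
  void setBytesOnDisk(long n) { bytesOnDisk = n; }   // what readers see
  long getBytesOnDisk() { return bytesOnDisk; }
}

class PacketReceiverSketch {
  private final ReplicaSketch replica = new ReplicaSketch();
  private final OutputStream out;
  private long written;

  PacketReceiverSketch(OutputStream out) { this.out = out; }

  // Unsafe: the new length is published while the bytes may still be buffered,
  // so a concurrent reader can hit an EOF or a checksum mismatch.
  void receiveUnsafe(byte[] pkt) throws IOException {
    replica.setBytesOnDisk(written + pkt.length);
    out.write(pkt);
    out.flush();
    written += pkt.length;
  }

  // Safe: flush first, then publish the length a concurrent reader may use.
  void receiveSafe(byte[] pkt) throws IOException {
    out.write(pkt);
    out.flush();
    written += pkt.length;
    replica.setBytesOnDisk(written);
  }
}
{code}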
[jira] [Updated] (HDFS-1941) Remove -genclusterid from NameNode startup options
[ https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1941: -- Component/s: name-node Fix Version/s: 0.23.0 Remove -genclusterid from NameNode startup options -- Key: HDFS-1941 URL: https://issues.apache.org/jira/browse/HDFS-1941 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1941-1.patch Currently, namenode -genclusterid is a helper utility to generate a unique clusterid. This option becomes unnecessary once namenode -format automatically generates the clusterid. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1941) Remove -genclusterid from NameNode startup options
[ https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1941: -- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I committed the patch. Thank you Bharath. Remove -genclusterid from NameNode startup options -- Key: HDFS-1941 URL: https://issues.apache.org/jira/browse/HDFS-1941 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1941-1.patch Currently, namenode -genclusterid is a helper utility to generate a unique clusterid. This option becomes unnecessary once namenode -format automatically generates the clusterid. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036343#comment-13036343 ] sam rash commented on HDFS-1057: if it helps, there is only ever 1 writer + 1 reader in the test. 1 reader 'tails' by opening and closing the file repeatedly, up to 1000 times (hence exposing socket leaks in the past) Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036347#comment-13036347 ] Todd Lipcon commented on HDFS-1958: --- bq. particularly when an experienced commmitter is raising questions about it? Excuse me - I took Nicholas's question for a joke, to be honest, given it referenced high school students and didn't raise technical objections. bq. fsck and format are different: it is okay to accidentally click y for fsck but not for format. OK, another comparison: mke2fs doesn't ask for confirmation at all. I checked this across ext2, ext3, and ntfs. Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036348#comment-13036348 ] Tsz Wo (Nicholas), SZE commented on HDFS-1057: -- Sam, could you either investigate the underlying problem or improve the test so that it won't fail on hudson? Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
[ https://issues.apache.org/jira/browse/HDFS-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036349#comment-13036349 ] Todd Lipcon commented on HDFS-1922: --- cool, thanks Luke. +1 on this patch to fix the tests, then. Recurring failure in TestJMXGet.testNameNode since build 477 on May 11 -- Key: HDFS-1922 URL: https://issues.apache.org/jira/browse/HDFS-1922 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Matt Foley Assignee: Luke Lu Fix For: 0.23.0 Attachments: hdfs-1922-conf-v1.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036353#comment-13036353 ] Todd Lipcon commented on HDFS-1958: --- btw, for those who might be concerned about accidentally formatting the NN (perhaps your cat likes to jump on the 'y' key and then the enter key), you can also enable HDFS-718 to completely disallow it. Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail
[ https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036366#comment-13036366 ] Matt Foley commented on HDFS-1952: -- Agree. Maybe change the exception message to "Failed to initialize edits log in any storage directory." The test-patch failures are recurring issues unrelated to this patch. FSEditLog.open() appears to succeed even if all EDITS directories fail -- Key: HDFS-1952 URL: https://issues.apache.org/jira/browse/HDFS-1952 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Andrew Wang Labels: newbie Attachments: hdfs-1952.patch FSEditLog.open() appears to succeed even if all of the individual directories failed to allow creation of an EditLogOutputStream. The problem and solution are essentially similar to that of HDFS-1505. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
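The fix pattern implied here (the same one as HDFS-1505) is to count the streams that open successfully and fail loudly when none survive. A hedged sketch with illustrative names, not the real FSEditLog API:
{code}
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the check the issue asks for.
class EditLogOpenSketch {
  List<OutputStream> open(List<File> editsDirs) throws IOException {
    List<OutputStream> streams = new ArrayList<>();
    for (File dir : editsDirs) {
      try {
        streams.add(new FileOutputStream(new File(dir, "edits")));
      } catch (IOException e) {
        // A single bad directory is tolerable; log and keep going.
        System.err.println("Unable to open edit log in " + dir + ": " + e);
      }
    }
    if (streams.isEmpty()) {
      // open() must not appear to succeed when every directory failed.
      throw new IOException(
          "Failed to initialize edits log in any storage directory.");
    }
    return streams;
  }
}
{code}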
[jira] [Updated] (HDFS-1922) Recurring failure in TestJMXGet.testNameNode since build 477 on May 11
[ https://issues.apache.org/jira/browse/HDFS-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1922: -- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Luke. Recurring failure in TestJMXGet.testNameNode since build 477 on May 11 -- Key: HDFS-1922 URL: https://issues.apache.org/jira/browse/HDFS-1922 Project: Hadoop HDFS Issue Type: Sub-task Components: test Reporter: Matt Foley Assignee: Luke Lu Fix For: 0.23.0 Attachments: hdfs-1922-conf-v1.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1957) Documentation for HFTP
[ https://issues.apache.org/jira/browse/HDFS-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1957: -- Resolution: Fixed Fix Version/s: (was: 0.23.0) 0.22.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to 22 and trunk. Thanks, Ari! Documentation for HFTP -- Key: HDFS-1957 URL: https://issues.apache.org/jira/browse/HDFS-1957 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 0.23.0 Reporter: Ari Rabkin Assignee: Ari Rabkin Priority: Minor Fix For: 0.22.0 Attachments: HDFS-1957.patch, HDFS-1957.patch, HDFS-1957.patch There should be some documentation for HFTP. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036376#comment-13036376 ] Matt Foley commented on HDFS-1505: -- Reading the HDFS-1073 spec, I infer that fsimage files will have a tag identifying the last txn included in the image, and edits logs will have tags for the first and last txn included in them. And you're referring to the resulting fact that one could take an image ending with txn 100, jump into the middle of a log file that went from txn 50 to 170, and successfully generate the in-memory structures current as of txn 170. Is that right? If the above understanding is correct, then I agree it seems that saveNamespace() should just save the fsimage file. Although it doesn't hurt to also clear the edits logs, once you have multiple copies of the fsimage. Does your log-rolling logic automatically delete log chunk files older than available fsimage files? That would be sufficient edits file management. saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-1-test.txt, hdfs-1505-22.0.patch, hdfs-1505-22.1.patch, hdfs-1505-22.2.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch, hdfs-1505-trunk.1.patch, hdfs-1505-trunk.2.patch, hdfs-1505-trunk.3.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1568) Improve DataXceiver error logging
[ https://issues.apache.org/jira/browse/HDFS-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036383#comment-13036383 ] Hadoop QA commented on HDFS-1568: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479798/HDFS-1568-output-changes.patch against trunk revision 1124576. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash org.apache.hadoop.tools.TestJMXGet +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/586//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/586//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/586//console This message is automatically generated. Improve DataXceiver error logging - Key: HDFS-1568 URL: https://issues.apache.org/jira/browse/HDFS-1568 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Joey Echeverria Priority: Minor Labels: newbie Attachments: HDFS-1568-1.patch, HDFS-1568-output-changes.patch In supporting customers we often see things like SocketTimeoutExceptions or EOFExceptions coming from DataXceiver, but the logging isn't very good. For example, if we get an IOE while setting up a connection to the downstream mirror in writeBlock, the IP of the downstream mirror isn't logged on the DN side. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1505) saveNamespace appears to succeed even if all directories fail to save
[ https://issues.apache.org/jira/browse/HDFS-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036396#comment-13036396 ] Todd Lipcon commented on HDFS-1505: --- Hey Matt. You're pretty close. bq. And you're referring to the resulting fact that one could take an image ending with txn 100, jump into the middle of a log file that went from txn 50 to 170 In theory, yes. In the current implementation, images are only saved at boundaries of edit log segments. So if you have an image with txn 100, then you'll have some edit log file which starts at 101, so the jump into the middle part isn't necessary. bq. Although it doesn't hurt to also clear the edits logs, once you have multiple copies of the fsimage. Does your log-rolling logic automatically delete log chunk files older than available fsimage files? It's not implemented yet, but the idea is that a separate background thread would be responsible for handling management of old files based on various policies (eg remove old ones, or perhaps archive to some other location) So, sounds like we're in agreement. Thanks. saveNamespace appears to succeed even if all directories fail to save - Key: HDFS-1505 URL: https://issues.apache.org/jira/browse/HDFS-1505 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1505-1-test.txt, hdfs-1505-22.0.patch, hdfs-1505-22.1.patch, hdfs-1505-22.2.patch, hdfs-1505-test.txt, hdfs-1505-trunk.0.patch, hdfs-1505-trunk.1.patch, hdfs-1505-trunk.2.patch, hdfs-1505-trunk.3.patch After HDFS-1071, saveNamespace now appears to succeed even if all of the individual directories failed to save. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
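For concreteness, the loading rule the two are converging on can be sketched like this; the types are hypothetical stand-ins (record syntax used for brevity), not the HDFS-1073 implementation:
{code}
import java.util.List;

// Illustration of the rule discussed above: take the newest image, then
// replay only the edit segments that start right after it.
class CheckpointSketch {
  record Image(long lastTxn) {}
  record EditSegment(long firstTxn, long lastTxn) {}

  static long load(Image image, List<EditSegment> segments) {
    long current = image.lastTxn();
    for (EditSegment seg : segments) {
      if (seg.firstTxn() == current + 1) {  // segments begin at image+1,
        current = seg.lastTxn();            // so no mid-file jump is needed
      }
    }
    return current;  // the namespace is current up to this transaction
  }
}
{code}
Under this rule, an image ending at txn 100 plus a segment spanning 101-170 yields a namespace current to txn 170, matching the exchange above.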
[jira] [Updated] (HDFS-420) fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse
[ https://issues.apache.org/jira/browse/HDFS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Bockelman updated HDFS-420: - Attachment: fuse_dfs_020_memleaks_v8.patch Ok, I tested this one for awhile prior to posting it. We have been running this on our ~2PB cluster with 250 machines for around 2-3 weeks. No crashes have been reported. No memory leaks are observed. Unit tests pass. Site admins report they are much happier with FUSE fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse --- Key: HDFS-420 URL: https://issues.apache.org/jira/browse/HDFS-420 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Affects Versions: 0.20.2 Environment: Fedora core 10, x86_64, 2.6.27.7-134.fc10.x86_64 #1 SMP (AMD 64), gcc 4.3.2, java 1.6.0 (IcedTea6 1.4 (fedora-7.b12.fc10-x86_64) Runtime Environment (build 1.6.0_0-b12) OpenJDK 64-Bit Server VM (build 10.0-b19, mixed mode) Reporter: Dima Brodsky Assignee: Brian Bockelman Fix For: 0.20.3 Attachments: fuse_dfs_020_memleaks.patch, fuse_dfs_020_memleaks_v3.patch, fuse_dfs_020_memleaks_v8.patch I run the following test: 1. Run hadoop DFS in single node mode 2. start up fuse_dfs 3. copy my source tree, about 250 megs, into the DFS cp -av * /mnt/hdfs/ in /var/log/messages I keep seeing: Dec 22 09:02:08 bodum fuse_dfs: ERROR: hdfs trying to utime /bar/backend-trunk2/src/machinery/hadoop/output/2008/11/19 to 1229385138/1229963739 and then eventually Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 and the file system hangs. hadoop is still running and I don't see any errors in it's logs. I have to unmount the dfs and restart fuse_dfs and then everything is fine again. 
At some point I see the following messages in the /var/log/messages: ERROR: dfs problem - could not close file_handle(139677114350528) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8339-93825052368848-1229278807.log fuse_dfs.c:1464 Dec 22 09:04:49 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139676770220176) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8140-93825025883216-1229278759.log fuse_dfs.c:1464 Dec 22 09:05:13 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139677114812832) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8138-93825070138960-1229251587.log fuse_dfs.c:1464 Is this a known issue? Am I just flooding the system too much. All of this is being performed on a single, dual core, machine. Thanks! ttyl Dima -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-420) fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse
[ https://issues.apache.org/jira/browse/HDFS-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036405#comment-13036405 ] Hadoop QA commented on HDFS-420: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479821/fuse_dfs_020_memleaks_v8.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/589//console This message is automatically generated. fuse_dfs is unable to connect to the dfs after a copying a large number of files into the dfs over fuse --- Key: HDFS-420 URL: https://issues.apache.org/jira/browse/HDFS-420 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Affects Versions: 0.20.2 Environment: Fedora core 10, x86_64, 2.6.27.7-134.fc10.x86_64 #1 SMP (AMD 64), gcc 4.3.2, java 1.6.0 (IcedTea6 1.4 (fedora-7.b12.fc10-x86_64) Runtime Environment (build 1.6.0_0-b12) OpenJDK 64-Bit Server VM (build 10.0-b19, mixed mode) Reporter: Dima Brodsky Assignee: Brian Bockelman Fix For: 0.20.3 Attachments: fuse_dfs_020_memleaks.patch, fuse_dfs_020_memleaks_v3.patch, fuse_dfs_020_memleaks_v8.patch I run the following test: 1. Run hadoop DFS in single node mode 2. start up fuse_dfs 3. copy my source tree, about 250 megs, into the DFS cp -av * /mnt/hdfs/ in /var/log/messages I keep seeing: Dec 22 09:02:08 bodum fuse_dfs: ERROR: hdfs trying to utime /bar/backend-trunk2/src/machinery/hadoop/output/2008/11/19 to 1229385138/1229963739 and then eventually Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1333 Dec 22 
09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1209 Dec 22 09:03:49 bodum fuse_dfs: ERROR: could not connect to dfs fuse_dfs.c:1037 and the file system hangs. hadoop is still running and I don't see any errors in it's logs. I have to unmount the dfs and restart fuse_dfs and then everything is fine again. At some point I see the following messages in the /var/log/messages: ERROR: dfs problem - could not close file_handle(139677114350528) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8339-93825052368848-1229278807.log fuse_dfs.c:1464 Dec 22 09:04:49 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139676770220176) for /bar/backend-trunk2/src/machinery/hadoop/input/2008/12/14/actionrecordlog-8140-93825025883216-1229278759.log fuse_dfs.c:1464 Dec 22 09:05:13 bodum fuse_dfs: ERROR: dfs problem - could not close file_handle(139677114812832) for
[jira] [Commented] (HDFS-1869) mkdirs should use the supplied permission for all of the created directories
[ https://issues.apache.org/jira/browse/HDFS-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036409#comment-13036409 ] Todd Lipcon commented on HDFS-1869: --- It would be great to track down which JIRA it was that broke this upstream as well. I don't think it could be quotas since they've been around since 0.17 or so iirc. mkdirs should use the supplied permission for all of the created directories Key: HDFS-1869 URL: https://issues.apache.org/jira/browse/HDFS-1869 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-1869-2.patch, HDFS-1869.patch Mkdirs only uses the supplied FsPermission for the last directory of the path. Paths 0..N-1 will all inherit the parent dir's permissions -even if- inheritPermission is false. This is a regression from somewhere around 0.20.9 and does not follow posix semantics. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
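The POSIX-style semantics the issue asks for, applying the supplied permission to every directory created along the path rather than only the last one, can be sketched with plain java.nio for illustration; the actual patch operates on the namenode's INode tree, and umask subtleties are glossed over here.
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Illustrative sketch, not the HDFS-1869 patch.
class MkdirsSketch {
  static void mkdirs(Path path, Set<PosixFilePermission> perm)
      throws IOException {
    // Walk from the root down, creating each missing component with perm,
    // so intermediate directories do not silently inherit parent modes.
    Path current = path.isAbsolute() ? path.getRoot() : Path.of(".");
    for (Path component : path) {
      current = current.resolve(component);
      if (!Files.exists(current)) {
        Files.createDirectory(current,
            PosixFilePermissions.asFileAttribute(perm));
      }
    }
  }
}
{code}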
[jira] [Commented] (HDFS-1961) New architectural documentation created
[ https://issues.apache.org/jira/browse/HDFS-1961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036410#comment-13036410 ] Matt Foley commented on HDFS-1961: -- Good start. A few suggestions:
Section 4.4: Suggest starting with: "All communication between Namenode and Datanode is initiated by the Datanode, and responded to by the Namenode. The Namenode never initiates communication to the Datanode, although Namenode responses may include commands to the Datanode that cause it to send further communications."
4.4.2 "DataNode Command – send heartbeat": suggest changing to "DataNode sends Heartbeat".
4.4.3 "DataNodeCommand – block report": suggest changing to "DataNode sends BlockReport".
4.4.4 "BlockReceived": suggest changing to "DataNode notifies BlockReceived".
Section 5.2: In the list of NN threads, calling the first one "HeartBeat" is a little confusing. Please consider calling it something like "Datanode Health Management" instead. In the code it is called HeartbeatMonitor, but its job is neither sending nor receiving heartbeats; rather, it periodically checks that every Datanode has sent a heartbeat at least once in the last 10 minutes (or as configured). Should probably also mention the bundle of threads that provide the Namenode's RPC service, which receives and processes all 13 kinds of communication from Datanodes and Clients.
Section 5.3: "This [blockReceived notification] may prevent NameNode temporarily from asking for a full block report since the receipt of a blockReceived() message indicates that the DataNode is still alive." That sentence isn't correct, since it is relatively unusual for the NN to ask the DN for a block report. (It only happens when recovering from gross errors.) Instead, suggest including in this section a brief discussion of the fact that the DN sends a heartbeat to the NN every 3 seconds (or as configured), which gives the NN a chance to respond with commands such as:
* delete replica, if a block has become over-replicated, or
* copy replica to this other DN, if a block needs further replication.
The DN also initiates a BlockReport to the NN every hour (or as configured), which prevents any divergence between the NN's and the DN's belief about which replicas are held by each datanode. And yes, it also sends an immediate blockReceived notification whenever it receives a new block, whether from a Client (file create/append) or from another Datanode (block replication).
"A blockReport() is also issued periodically as a portion of the HeartBeat." Not exactly. The DN's heartbeat thread takes care of sending both, at the appropriate time intervals, but they are separate RPCs to the NN.
New architectural documentation created --- Key: HDFS-1961 URL: https://issues.apache.org/jira/browse/HDFS-1961 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 0.21.0 Reporter: Rick Kazman Labels: architecture, hadoop, newbie Fix For: 0.21.0 Attachments: HDFS ArchDoc.Jira.docx This material provides an overview of the HDFS architecture and is intended for contributors. The goal of this document is to provide a guide to the overall structure of the HDFS code so that contributors can more effectively understand how changes that they are considering can be made, and the consequences of those changes. The assumption is that the reader has a basic understanding of HDFS, its purpose, and how it fits into the Hadoop project suite.
An HTML version of the architectural documentation can be found at: http://kazman.shidler.hawaii.edu/ArchDoc.html All comments and suggestions for improvements are appreciated. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
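The datanode-initiated traffic Matt describes above (3-second heartbeats whose replies may carry commands, plus an hourly block report sent as a separate RPC by the same thread) can be sketched roughly as follows; the interface is a hypothetical stand-in, not the real DatanodeProtocol:
{code}
// Rough illustration of the DN's offer-service loop; names are invented.
interface NamenodeStub {
  String[] sendHeartbeat();          // reply may contain commands for the DN
  void blockReport(long[] blockIds); // separate RPC, sent hourly
  void blockReceived(long blockId);  // sent immediately on receiving a block
}

class OfferServiceSketch implements Runnable {
  private final NamenodeStub namenode;
  private long lastBlockReport;

  OfferServiceSketch(NamenodeStub nn) { this.namenode = nn; }

  @Override public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      // Heartbeat every 3 seconds; execute whatever commands come back.
      for (String cmd : namenode.sendHeartbeat()) {
        System.out.println("executing command from NN: " + cmd);
      }
      long now = System.currentTimeMillis();
      if (now - lastBlockReport >= 60L * 60 * 1000) {  // hourly, its own RPC
        namenode.blockReport(new long[0]);
        lastBlockReport = now;
      }
      try { Thread.sleep(3000); }
      catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
  }
}
{code}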
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036413#comment-13036413 ] Tsz Wo (Nicholas), SZE commented on HDFS-1958: -- bq. Excuse me - I took Nicholas's question for a joke, to be honest, given it referenced high school students and didn't raise technical objections. It is a half joke. :) However, it is not convincing to change a feature affecting user behavior simply because someone has reported it on the mailing list. Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036416#comment-13036416 ] Tsz Wo (Nicholas), SZE commented on HDFS-1958: -- Todd, do you agree that this is a newbie issue? Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036419#comment-13036419 ] Todd Lipcon commented on HDFS-1958: --- bq. However, it is not as convincing as suggested to change a feature affecting user behavior simply because someone has reported on the mailing list. Fair enough. I'll try to reproduce the reasoning from the mailing list in the future. bq. Todd, do you agree that this is a newbie issue Yes. But I already addressed it so no need to retroactively tag it as such (IMO the point of the newbie label is just to help new contributors find open JIRAs that might be easy to start with). Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-1935) Build should not redownload ivy on every invocation
[ https://issues.apache.org/jira/browse/HDFS-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-1935: - Assignee: (was: Todd Lipcon) Build should not redownload ivy on every invocation --- Key: HDFS-1935 URL: https://issues.apache.org/jira/browse/HDFS-1935 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.22.0 Reporter: Todd Lipcon Priority: Trivial Labels: newbie Fix For: 0.22.0 Attachments: hdfs-1935.txt Currently we re-download ivy every time we build. If the jar already exists, we should skip this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036430#comment-13036430 ] Hadoop QA commented on HDFS-1371: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479689/HDFS-1371.0518.2.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 11 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/588//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/588//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/588//console This message is automatically generated. One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1568) Improve DataXceiver error logging
[ https://issues.apache.org/jira/browse/HDFS-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036434#comment-13036434 ] Tsz Wo (Nicholas), SZE commented on HDFS-1568: -- - Could you not reformat the message in this patch? Otherwise, it is hard to review. You may fix the message format in a separate JIRA.
{code}
-          block + " to " +
-          s.getInetAddress() + ":\n" +
-          StringUtils.stringifyException(ioe) );
+          block + " to " +
+          remoteAddress + ":\n" +
+          StringUtils.stringifyException(ioe) );
{code}
Improve DataXceiver error logging - Key: HDFS-1568 URL: https://issues.apache.org/jira/browse/HDFS-1568 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Joey Echeverria Priority: Minor Labels: newbie Attachments: HDFS-1568-1.patch, HDFS-1568-output-changes.patch In supporting customers we often see things like SocketTimeoutExceptions or EOFExceptions coming from DataXceiver, but the logging isn't very good. For example, if we get an IOE while setting up a connection to the downstream mirror in writeBlock, the IP of the downstream mirror isn't logged on the DN side. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036444#comment-13036444 ] Tanping Wang commented on HDFS-1371: These three tests are already failing on trunk. One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036446#comment-13036446 ] Hadoop QA commented on HDFS-1963: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479801/HDFS-1963.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 14 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. -1 release audit. The applied patch generated 2 release audit warnings (more than the trunk's current 0 warnings). -1 core tests. The patch failed these core unit tests: org.apache.hadoop.cli.TestHDFSCLI org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/587//testReport/ Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/587//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/587//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/587//console This message is automatically generated. HDFS rpm integration project Key: HDFS-1963 URL: https://issues.apache.org/jira/browse/HDFS-1963 Project: Hadoop HDFS Issue Type: New Feature Components: build Environment: Java 6, RHEL 5.5 Reporter: Eric Yang Assignee: Eric Yang Attachments: HDFS-1963.patch This jira is corresponding to HADOOP-6255 and associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for patch test build to verify against hdfs svn trunk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036453#comment-13036453 ] Todd Lipcon commented on HDFS-1057: --- Sam seems to be correct that there's some kind of leak going on. lsof on the java process shows several hundred unix sockets open. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036461#comment-13036461 ] sam rash commented on HDFS-1057: todd: thanks for digging into this Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1906) Remove logging exception stack trace when one of the datanode targets to read from is not reachable
[ https://issues.apache.org/jira/browse/HDFS-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1906: -- Attachment: HDFS-1906.rel205.patch Patch for 0.20.205. Remove logging exception stack trace when one of the datanode targets to read from is not reachable --- Key: HDFS-1906 URL: https://issues.apache.org/jira/browse/HDFS-1906 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.20.203.1 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1906.2.patch, HDFS-1906.patch, HDFS-1906.rel205.patch When the client fails to connect to one of the datanodes in the returned list of block locations, the exception stack trace is printed in the client log. This is an expected failure scenario that is handled at the client by moving on to the next location. Printing the entire stack trace is unnecessary; printing just the exception message should be sufficient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
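The change being reviewed amounts to logging only the exception's message for this expected failure before moving on to the next replica location. A minimal sketch under that reading, using java.util.logging and invented names rather than the actual DFSClient code:
{code}
import java.io.IOException;
import java.util.logging.Logger;

// Illustrative sketch, not the HDFS-1906 patch.
class ReadRetrySketch {
  private static final Logger LOG = Logger.getLogger("DFSClient");

  void readFrom(String[] datanodes) {
    for (String dn : datanodes) {
      try {
        connect(dn);
        return;  // success
      } catch (IOException e) {
        // Expected, recoverable failure: log the message only, no stack trace.
        LOG.info("Failed to connect to " + dn + ": " + e.getMessage()
            + ", trying next datanode");
      }
    }
  }

  private void connect(String dn) throws IOException {
    throw new IOException("connection refused");  // stand-in for a real dial
  }
}
{code}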
[jira] [Commented] (HDFS-1958) Format confirmation prompt should be more lenient of its input
[ https://issues.apache.org/jira/browse/HDFS-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036477#comment-13036477 ] Tsz Wo (Nicholas), SZE commented on HDFS-1958: -- bq. OK, another comparison: mke2fs doesn't ask for confirmation at all. I checked this across ext2, ext3, and ntfs. Not asking for confirmation at all and accepting input case-insensitively are two different things. Moreover, since the -format command has been there for years, I wonder if some admins are already taking advantage of the fact that 'y' won't format. For example, the 'yes' command outputs lower-case y's. bq. Yes. But I already addressed it so no need to retroactively tag it as such (IMO the point of the newbie label is just to help new contributors find open JIRAs that might be easy to start with). Okay, I think you would like to leave the easy issues for the new contributors. Format confirmation prompt should be more lenient of its input -- Key: HDFS-1958 URL: https://issues.apache.org/jira/browse/HDFS-1958 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1958.txt As reported on the mailing list, the namenode format prompt only accepts 'Y'. We should also accept 'y' and 'yes' (non-case-sensitive). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1906) Remove logging exception stack trace when one of the datanode targets to read from is not reachable
[ https://issues.apache.org/jira/browse/HDFS-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036485#comment-13036485 ] Tsz Wo (Nicholas), SZE commented on HDFS-1906: -- bq. Patch for 0.20.205 +1 Remove logging exception stack trace when one of the datanode targets to read from is not reachable --- Key: HDFS-1906 URL: https://issues.apache.org/jira/browse/HDFS-1906 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.20.203.1 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1906.2.patch, HDFS-1906.patch, HDFS-1906.rel205.patch When the client fails to connect to one of the datanodes in the returned list of block locations, the exception stack trace is printed in the client log. This is an expected failure scenario that is handled at the client by moving on to the next location. Printing the entire stack trace is unnecessary; printing just the exception message should be sufficient. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java
Incorrect HTML unescaping in DatanodeJspHelper.java --- Key: HDFS-1964 URL: https://issues.apache.org/jira/browse/HDFS-1964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0, 0.23.0 HDFS-1575 introduced some HTML unescaping of parameters so that viewing a file would work for paths containing HTML-escaped characters, but in two of the places the unescaping was done either too early or too late. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
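Why the stage at which unescaping happens matters can be shown with a toy example; this is not the DatanodeJspHelper code, just an illustration of how unescaping a value at the wrong point (here, twice) corrupts it:
{code}
// Toy illustration; unescape exactly once, after the raw parameter value
// has been extracted from the request and before it is used as a path.
class UnescapeOrderSketch {
  static String htmlUnescape(String s) {
    // &amp; must be handled last, or new entities would be created.
    return s.replace("&lt;", "<").replace("&gt;", ">")
            .replace("&quot;", "\"").replace("&amp;", "&");
  }

  public static void main(String[] args) {
    String param = "a&amp;lt;b";                       // encodes "a&lt;b"
    String once  = htmlUnescape(param);                // "a&lt;b"  (correct)
    String twice = htmlUnescape(htmlUnescape(param));  // "a<b"     (corrupted)
    System.out.println(once + " vs " + twice);
  }
}
{code}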
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036491#comment-13036491 ] Jitendra Nath Pandey commented on HDFS-1371: +1 One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036492#comment-13036492 ] Todd Lipcon commented on HDFS-1057: --- Actually, it looks like the leak in 7146 pushed this over the edge. But, even with that patch, if I lsof the java process as it runs, I see it hit 800 or so localhost TCP connections in ESTABLISHED state while running this test case. So, needs more investigation yet. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail
[ https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036493#comment-13036493 ] Hadoop QA commented on HDFS-1952: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479839/hdfs-1952.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/591//console This message is automatically generated. FSEditLog.open() appears to succeed even if all EDITS directories fail -- Key: HDFS-1952 URL: https://issues.apache.org/jira/browse/HDFS-1952 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Andrew Wang Labels: newbie Attachments: hdfs-1952.patch, hdfs-1952.patch FSEditLog.open() appears to succeed even if all of the individual directories failed to allow creation of an EditLogOutputStream. The problem and solution are essentially similar to that of HDFS-1505. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1575: - Attachment: hdfs-1575-22.0.patch What's attached is a faithful back-port of the trunk commit. In the course of doing this back-port I identified a bug, which I've filed under HDFS-1964. Let's commit this patch to branch-0.22 and then I'll back-port the bug fix. viewing block from web UI broken Key: HDFS-1575 URL: https://issues.apache.org/jira/browse/HDFS-1575 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: HDFS-1575, hdfs-1575-22.0.patch, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL {{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}} java.io.FileNotFoundException: File does not exist: / at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834) ... at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258) at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-593) Support for getting user home dir from server side
[ https://issues.apache.org/jira/browse/HDFS-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036503#comment-13036503 ] Sanjay Radia commented on HDFS-593: --- Let's revisit this issue. When writing the tests for viewfs's trash bin (see HADOOP-7284), I had to take into account that the tests ran on Mac or Linux boxes or HDFS, each of which has a different notion of the home directory. When this Jira was filed, the proposal was that the home dir be server-side (SS) config. That made sense to me. With viewfs (i.e. a client-side mount table) there is no server side, and furthermore the client-side mount table points to multiple file servers. Since viewfs is configured via config variables, it is quite easy to add a config variable for this. I proposed that in HADOOP-7284 and Todd agreed. But I think this topic deserves a fresh look; the two options are: * For HDFS, home-dir is a SS property with a default of /user; for viewfs it is determined from viewfs's config; and for localfs it is figured out dynamically. * Home dir is a config variable with a default of /user, and for viewfs it is determined from its config so that it can adapt to mounts of localfs and hdfs. Support for getting user home dir from server side -- Key: HDFS-593 URL: https://issues.apache.org/jira/browse/HDFS-593 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client, name-node Reporter: Kan Zhang This is a sub-task of HADOOP-4952. Currently the Path of the user home dir is constructed on the client side using the convention /user/$USER. HADOOP-4952 calls for it to be retrieved from the server side. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
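As an illustration of the second option above, a sketch of resolving the home directory from a config variable with a default of /user; the key name fs.homeDir.prefix is an assumption for illustration, not an agreed API.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

class HomeDirSketch {
  // Key name is hypothetical; the point is a default of /user that each
  // deployment (and, for viewfs, each mount) can override in its config.
  static Path homeDirectory(Configuration conf, String userName) {
    String prefix = conf.get("fs.homeDir.prefix", "/user");
    return new Path(prefix, userName);
  }
}
{code}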
[jira] [Updated] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail
[ https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-1952: -- Attachment: hdfs-1952.patch Used --strip-prefix this time. Tested application to trunk with {{patch -p0 < hdfs-1952.patch}}; hopefully Hudson likes it. FSEditLog.open() appears to succeed even if all EDITS directories fail -- Key: HDFS-1952 URL: https://issues.apache.org/jira/browse/HDFS-1952 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Andrew Wang Labels: newbie Attachments: hdfs-1952.patch, hdfs-1952.patch, hdfs-1952.patch FSEditLog.open() appears to succeed even if all of the individual directories failed to allow creation of an EditLogOutputStream. The problem and solution are essentially similar to that of HDFS-1505. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1575) viewing block from web UI broken
[ https://issues.apache.org/jira/browse/HDFS-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036507#comment-13036507 ] Hadoop QA commented on HDFS-1575: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479841/hdfs-1575-22.0.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 5 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/593//console This message is automatically generated. viewing block from web UI broken Key: HDFS-1575 URL: https://issues.apache.org/jira/browse/HDFS-1575 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Aaron T. Myers Priority: Blocker Fix For: 0.22.0 Attachments: HDFS-1575, hdfs-1575-22.0.patch, hdfs-1575-trunk.0.patch, hdfs-1575-trunk.1.patch, hdfs-1575-trunk.2.patch, hdfs-1575-trunk.3.patch DatanodeJspHelper seems to expect the file path to be in the path info of the HttpRequest, rather than in a parameter. I see the following exception when visiting the URL {{http://localhost.localdomain:50075/browseBlock.jsp?blockId=5006108823351810567&blockSize=20&genstamp=1001&filename=%2Fuser%2Ftodd%2Fissue&datanodePort=50010&namenodeInfoPort=50070}} java.io.FileNotFoundException: File does not exist: / at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInternal(FSNamesystem.java:834) ... at org.apache.hadoop.hdfs.server.datanode.DatanodeJspHelper.generateFileDetails(DatanodeJspHelper.java:258) at org.apache.hadoop.hdfs.server.datanode.browseBlock_jsp._jspService(browseBlock_jsp.java:79) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java
[ https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1964: - Attachment: hdfs-1964-trunk.0.patch Patch addressing the issue. Incorrect HTML unescaping in DatanodeJspHelper.java --- Key: HDFS-1964 URL: https://issues.apache.org/jira/browse/HDFS-1964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1964-trunk.0.patch HDFS-1575 introduced some HTML unescaping of parameters so that viewing a file would work for paths containing HTML-escaped characters, but in two of the places did the unescaping either too early or too late. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java
[ https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-1964: - Status: Patch Available (was: Open) Incorrect HTML unescaping in DatanodeJspHelper.java --- Key: HDFS-1964 URL: https://issues.apache.org/jira/browse/HDFS-1964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1964-trunk.0.patch HDFS-1575 introduced some HTML unescaping of parameters so that viewing a file would work for paths containing HTML-escaped characters, but in two of the places did the unescaping either too early or too late. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
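A sketch of the pattern at issue, using a hypothetical helper rather than the patched DatanodeJspHelper code: the request parameter should be HTML-unescaped exactly once, at the point it is read, and used in raw form thereafter.
{code:java}
import javax.servlet.http.HttpServletRequest;
import org.apache.commons.lang.StringEscapeUtils;

class UnescapeSketch {
  // Unescape here, once; unescaping before the raw parameter is extracted,
  // or again after the path has already been used, reintroduces the bug.
  static String filenameParam(HttpServletRequest request) {
    String escaped = request.getParameter("filename");
    return escaped == null ? null : StringEscapeUtils.unescapeHtml(escaped);
  }
}
{code}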
[jira] [Updated] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated HDFS-1963: Attachment: HDFS-1963-1.patch Store config templates in $PREFIX/share/hadoop/templates, and change the related script to use the new location. HDFS rpm integration project Key: HDFS-1963 URL: https://issues.apache.org/jira/browse/HDFS-1963 Project: Hadoop HDFS Issue Type: New Feature Components: build Environment: Java 6, RHEL 5.5 Reporter: Eric Yang Assignee: Eric Yang Attachments: HDFS-1963-1.patch, HDFS-1963.patch This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against the hdfs svn trunk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036525#comment-13036525 ] Todd Lipcon commented on HDFS-1057: --- aha! I think I understand what's going on here! The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. This also explains why Sam doesn't see it on his 0.20 append branch -- there are no block tokens there, so the RPC connection is getting reused properly. I'll file another JIRA about this issue. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1057) Concurrent readers hit ChecksumExceptions if following a writer to very end of file
[ https://issues.apache.org/jira/browse/HDFS-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036530#comment-13036530 ] Tsz Wo (Nicholas), SZE commented on HDFS-1057: -- Todd, well done! Thanks for investigating it. Concurrent readers hit ChecksumExceptions if following a writer to very end of file --- Key: HDFS-1057 URL: https://issues.apache.org/jira/browse/HDFS-1057 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20-append, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: sam rash Priority: Blocker Fix For: 0.20-append, 0.21.0, 0.22.0 Attachments: HDFS-1057-0.20-append.patch, conurrent-reader-patch-1.txt, conurrent-reader-patch-2.txt, conurrent-reader-patch-3.txt, hdfs-1057-trunk-1.txt, hdfs-1057-trunk-2.txt, hdfs-1057-trunk-3.txt, hdfs-1057-trunk-4.txt, hdfs-1057-trunk-5.txt, hdfs-1057-trunk-6.txt In BlockReceiver.receivePacket, it calls replicaInfo.setBytesOnDisk before calling flush(). Therefore, if there is a concurrent reader, it's possible to race here - the reader will see the new length while those bytes are still in the buffers of BlockReceiver. Thus the client will potentially see checksum errors or EOFs. Additionally, the last checksum chunk of the file is made accessible to readers even though it is not stable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
[ https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036531#comment-13036531 ] Todd Lipcon commented on HDFS-1965: --- I can think of a couple possible solutions: a) make the methods that operate on a block take an additional parameter to contain block tokens, rather than using the normal token selector mechanism that scopes credentials on a per-connection basis. This has the advantage that we can even re-use an IPC connection across different blocks. b) when the client creates an IPC proxy to a DN, it can explicitly configure the maxIdleTime to 0 so that we don't leave connections hanging around after the call completes. This is less efficient than option A above, but it probably doesn't matter much for this use case. IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
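A minimal sketch of option (b), assuming the standard ipc.client.connection.maxidletime IPC key; the proxy-creation call itself is elided since its exact signature varies by version.
{code:java}
import org.apache.hadoop.conf.Configuration;

class DatanodeIpcConfSketch {
  static Configuration confForDatanodeIpc(Configuration base) {
    Configuration dnConf = new Configuration(base);
    // Tear the connection down as soon as the call completes, so that
    // per-block-token connections cannot accumulate in the client's
    // connection cache.
    dnConf.setInt("ipc.client.connection.maxidletime", 0);
    return dnConf;
  }
}
{code}
As the comment notes, option (a) remains the more efficient fix, since one connection could then serve calls for many different blocks.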
[jira] [Created] (HDFS-1966) Encapsulate individual DataTransferProtocol op header
Encapsulate individual DataTransferProtocol op header - Key: HDFS-1966 URL: https://issues.apache.org/jira/browse/HDFS-1966 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE It will make a clear distinction between the variables used in the protocol and the others. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036542#comment-13036542 ] Jitendra Nath Pandey commented on HDFS-1371: I have committed this. Thanks to Tanping! One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1371) One bad node can incorrectly flag many files as corrupt
[ https://issues.apache.org/jira/browse/HDFS-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-1371: --- Resolution: Fixed Status: Resolved (was: Patch Available) One bad node can incorrectly flag many files as corrupt --- Key: HDFS-1371 URL: https://issues.apache.org/jira/browse/HDFS-1371 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.23.0 Environment: yahoo internal version [knoguchi@gwgd4003 ~]$ hadoop version Hadoop 0.20.104.3.1007030707 Reporter: Koji Noguchi Assignee: Tanping Wang Fix For: 0.23.0 Attachments: HDFS-1371.04252011.patch, HDFS-1371.0503.patch, HDFS-1371.0513.patch, HDFS-1371.0515.patch, HDFS-1371.0517.2.patch, HDFS-1371.0517.patch, HDFS-1371.0518.2.patch, HDFS-1371.0518.patch On our cluster, 12 files were reported as corrupt by fsck even though the replicas on the datanodes were healthy. Turns out that all the replicas (12 files x 3 replicas per file) were reported corrupt from one node. Surprisingly, these files were still readable/accessible from dfsclient (-get/-cat) without any problems. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1966) Encapsulate individual DataTransferProtocol op header
[ https://issues.apache.org/jira/browse/HDFS-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1966: - Attachment: h1966_20110519.patch h1966_20110519.patch: added {{CopyBlockHeader}} for illustrating the idea. Encapsulate individual DataTransferProtocol op header - Key: HDFS-1966 URL: https://issues.apache.org/jira/browse/HDFS-1966 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node, hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1966_20110519.patch It will make a clear distinction between the variables used in the protocol and the others. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
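The encapsulation idea, sketched with assumed field names (the actual h1966_20110519.patch may differ): each op's wire fields live in one header object with its own serialization methods, instead of being written inline at the call sites.
{code:java}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

class CopyBlockHeaderSketch {
  private long blockId;
  private long generationStamp;

  // All fields the op puts on the wire are written in one place...
  void write(DataOutput out) throws IOException {
    out.writeLong(blockId);
    out.writeLong(generationStamp);
  }

  // ...and read back in one place, keeping protocol variables distinct
  // from the rest of the DataNode/client state.
  void readFields(DataInput in) throws IOException {
    blockId = in.readLong();
    generationStamp = in.readLong();
  }
}
{code}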
[jira] [Commented] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036546#comment-13036546 ] Hadoop QA commented on HDFS-1877: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479831/TestWriteRead.patch against trunk revision 1125057. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//testReport/ Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/590//console This message is automatically generated. Create a functional test for file read/write Key: HDFS-1877 URL: https://issues.apache.org/jira/browse/HDFS-1877 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.22.0 Reporter: CW Chung Priority: Minor Attachments: TestWriteRead.java, TestWriteRead.patch, TestWriteRead.patch It would be great to have a tool, running on a real grid, to perform functional tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
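A rough sketch of the core write/hflush/read check such a tool would perform; FileSystem, hflush, and readFully are the real APIs, while the flow and the path below are illustrative.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/tmp/writeReadSketch");
    byte[] chunk = new byte[4096];

    FSDataOutputStream out = fs.create(file, true);
    out.write(chunk);
    out.hflush();  // make the bytes visible to concurrent readers
    out.close();

    // Read everything back with a positional read and let any
    // ChecksumException or EOFException fail the run.
    FSDataInputStream in = fs.open(file);
    byte[] back = new byte[chunk.length];
    in.readFully(0, back);
    in.close();
  }
}
{code}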
[jira] [Commented] (HDFS-1952) FSEditLog.open() appears to succeed even if all EDITS directories fail
[ https://issues.apache.org/jira/browse/HDFS-1952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036550#comment-13036550 ] Matt Foley commented on HDFS-1952: -- Sorry I missed this the first time. It's minor, so you don't have to re-spin the patch just for this, but for future reference: Per the coding guidelines (http://wiki.apache.org/hadoop/HowToContribute#Making_Changes) please add { } after if statements, even single-line ones. Thanks. +1 pending Hudson test-patch results. FSEditLog.open() appears to succeed even if all EDITS directories fail -- Key: HDFS-1952 URL: https://issues.apache.org/jira/browse/HDFS-1952 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Andrew Wang Labels: newbie Attachments: hdfs-1952.patch, hdfs-1952.patch, hdfs-1952.patch FSEditLog.open() appears to succeed even if all of the individual directories failed to allow creation of an EditLogOutputStream. The problem and solution are essentially similar to that of HDFS-1505. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036552#comment-13036552 ] Tsz Wo (Nicholas), SZE commented on HDFS-1877: -- CW, please grant license to ASF for your latest patch. Create a functional test for file read/write Key: HDFS-1877 URL: https://issues.apache.org/jira/browse/HDFS-1877 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: CW Chung Priority: Minor Attachments: TestWriteRead.java, TestWriteRead.patch, TestWriteRead.patch It would be great to have a tool, running on a real grid, to perform functional tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1877: - Affects Version/s: (was: 0.22.0) Assignee: CW Chung Create a functional test for file read/write Key: HDFS-1877 URL: https://issues.apache.org/jira/browse/HDFS-1877 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: CW Chung Assignee: CW Chung Priority: Minor Attachments: TestWriteRead.java, TestWriteRead.patch, TestWriteRead.patch It would be great to have a tool, running on a real grid, to perform functional tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
[ https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036554#comment-13036554 ] Todd Lipcon commented on HDFS-1965: --- I implemented option (b) and have a test case that shows that it fixes the problem... BUT: the real DFSInputStream code seems to call RPC.stopProxy() after it uses the proxy, which should also avoid this issue. Doing so in my test case makes the case pass without any other fix. So there's still some mystery. IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart
[ https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036561#comment-13036561 ] Aaron T. Myers commented on HDFS-1921: -- Sure, Matt. Here's the output from test-patch on branch-0.22: {noformat} +1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 system test framework. The patch passed system test framework compile. {noformat} Save namespace can cause NN to be unable to come up on restart -- Key: HDFS-1921 URL: https://issues.apache.org/jira/browse/HDFS-1921 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Matt Foley Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1505-1-test.txt, hdfs-1921-2.patch, hdfs-1921-2_v22.patch, hdfs-1921.txt, hdfs1921_v23.patch, hdfs1921_v23.patch I discovered this in the course of trying to implement a fix for HDFS-1505. Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save namespace proceeds in the following order: # rename current to lastcheckpoint.tmp for all of them, # save image and recreate edits for all of them, # rename lastcheckpoint.tmp to previous.checkpoint. The problem is that step 3 occurs regardless of whether or not an error occurs for all storage directories in step 2. Upon restart, the NN will see non-existent or corrupt {{current}} directories, and no {{lastcheckpoint.tmp}} directories, and so will conclude that the storage directories are not formatted. This issue appears to be present on both 0.22 and 0.23. This should arguably be a 0.22/0.23 blocker. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
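For clarity, a hedged sketch of the corrected ordering implied by the description: lastcheckpoint.tmp is promoted only in directories where step 2 succeeded, and the whole operation fails if none did. Names are illustrative, not the actual FSImage code.
{code:java}
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class SaveNamespaceSketch {
  void saveNamespace(List<File> storageDirs) throws IOException {
    List<File> succeeded = new ArrayList<File>();
    for (File dir : storageDirs) {
      // Step 1: rename current to lastcheckpoint.tmp.
      new File(dir, "current").renameTo(new File(dir, "lastcheckpoint.tmp"));
      try {
        // Step 2: save image and recreate edits.
        saveImageAndEdits(dir);
        succeeded.add(dir);
      } catch (IOException ioe) {
        System.err.println("saveNamespace failed in " + dir + ": " + ioe);
      }
    }
    if (succeeded.isEmpty()) {
      throw new IOException("saveNamespace failed in all storage directories");
    }
    // Step 3: promote only the directories that completed step 2, so a
    // restart can still recover the others from lastcheckpoint.tmp.
    for (File dir : succeeded) {
      new File(dir, "lastcheckpoint.tmp")
          .renameTo(new File(dir, "previous.checkpoint"));
    }
  }

  private void saveImageAndEdits(File dir) throws IOException {
    // elided: write fsimage and recreate edits under dir/current
  }
}
{code}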
[jira] [Commented] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java
[ https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036576#comment-13036576 ] Eli Collins commented on HDFS-1964: --- +1 lgtm Incorrect HTML unescaping in DatanodeJspHelper.java --- Key: HDFS-1964 URL: https://issues.apache.org/jira/browse/HDFS-1964 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1964-trunk.0.patch HDFS-1575 introduced some HTML unescaping of parameters so that viewing a file would work for paths containing HTML-escaped characters, but in two of the places did the unescaping either too early or too late. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
[ https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1965: -- Status: Patch Available (was: Open) IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 Attachments: hdfs-1965.txt This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections
[ https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-1965: -- Attachment: hdfs-1965.txt Turns out the reason that RPC.stopProxy isn't effective in real life is that the WritableRpcEngine Client objects are cached in ClientCache with keys that aren't tied to principals. So, stopProxy doesn't actually cause the connection to disconnect. I'm not sure if that's a bug or by design. This patch now includes a regression test that simulates DFSClient closely. IPCs done using block token-based tickets can't reuse connections - Key: HDFS-1965 URL: https://issues.apache.org/jira/browse/HDFS-1965 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 Attachments: hdfs-1965.txt This is the reason that TestFileConcurrentReaders has been failing a lot. Reproducing a comment from HDFS-1057: The test has a thread which continually re-opens the file which is being written to. Since the file's in the middle of being written, it makes an RPC to the DataNode in order to determine the visible length of the file. This RPC is authenticated using the block token which came back in the LocatedBlocks object as the security ticket. When this RPC hits the IPC layer, it looks at its existing connections and sees none that can be re-used, since the block token differs between the two requesters. Hence, it reconnects, and we end up with hundreds or thousands of IPC connections to the datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1602) NameNode storage failed replica restoration is broken
[ https://issues.apache.org/jira/browse/HDFS-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-1602: -- Summary: NameNode storage failed replica restoration is broken (was: Fix HADOOP-4885 for it is doesn't work as expected.) NameNode storage failed replica restoration is broken - Key: HDFS-1602 URL: https://issues.apache.org/jira/browse/HDFS-1602 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0, 0.23.0 Reporter: Konstantin Boudnik Assignee: Boris Shkolnik Fix For: 0.22.0 Attachments: HDFS-1602-1.patch, HDFS-1602.patch, HDFS-1602v22.patch NameNode storage restore functionality doesn't work (as HDFS-903 demonstrated). This needs to be either disabled, or removed, or fixed. This feature also fails HDFS-1496 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-1967) TestHDFSTrash failing on trunk and 22
TestHDFSTrash failing on trunk and 22 - Key: HDFS-1967 URL: https://issues.apache.org/jira/browse/HDFS-1967 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Fix For: 0.22.0 Seems to have started failing recently in many commit builds as well as the last two nightly builds of 22: https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/51/testReport/org.apache.hadoop.hdfs/TestHDFSTrash/testTrashEmptier/ -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1963) HDFS rpm integration project
[ https://issues.apache.org/jira/browse/HDFS-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036586#comment-13036586 ] Hadoop QA commented on HDFS-1963: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12479849/HDFS-1963-1.patch against trunk revision 1125145. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 14 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. -1 release audit. The applied patch generated 2 release audit warnings (more than the trunk's current 0 warnings). -1 core tests. The patch failed these core unit tests: org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.TestFileConcurrentReader org.apache.hadoop.hdfs.TestHDFSTrash +1 contrib tests. The patch passed contrib unit tests. +1 system test framework. The patch passed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/595//testReport/ Release audit warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/595//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/595//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/595//console This message is automatically generated. HDFS rpm integration project Key: HDFS-1963 URL: https://issues.apache.org/jira/browse/HDFS-1963 Project: Hadoop HDFS Issue Type: New Feature Components: build Environment: Java 6, RHEL 5.5 Reporter: Eric Yang Assignee: Eric Yang Attachments: HDFS-1963-1.patch, HDFS-1963.patch This jira corresponds to HADOOP-6255 and the associated directory layout change. The patch for creating HDFS rpm packaging should be posted here for the patch test build to verify against the hdfs svn trunk. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1877) Create a functional test for file read/write
[ https://issues.apache.org/jira/browse/HDFS-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] CW Chung updated HDFS-1877: --- Attachment: TestWriteRead.patch Granted license to Apache. Otherwise, this version is the same as the one submitted 2 hours ago. Create a functional test for file read/write Key: HDFS-1877 URL: https://issues.apache.org/jira/browse/HDFS-1877 Project: Hadoop HDFS Issue Type: Test Components: test Reporter: CW Chung Assignee: CW Chung Priority: Minor Attachments: TestWriteRead.java, TestWriteRead.patch, TestWriteRead.patch, TestWriteRead.patch It would be great to have a tool, running on a real grid, to perform functional tests (and stress tests to a certain extent) for the file operations. The tool would be written in Java and make HDFS API calls to read, write, append, and hflush Hadoop files. The tool would be usable standalone, or as a building block for other regression or stress test suites (written in shell, perl, python, etc). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira