[jira] [Created] (HDFS-1988) TestTrash infinite loops if run on a home directory with stuff in .Trash

2011-05-24 Thread Matt Foley (JIRA)
TestTrash infinite loops if run on a home directory with stuff in .Trash


 Key: HDFS-1988
 URL: https://issues.apache.org/jira/browse/HDFS-1988
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 0.22.0
Reporter: Matt Foley
 Fix For: 0.22.0


Seems to have started failing recently in many commit builds as well as the 
last two nightly builds of 22:
https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/51/testReport/org.apache.hadoop.hdfs/TestHDFSTrash/testTrashEmptier/

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1988) TestTrash infinite loops if run on a home directory with stuff in .Trash

2011-05-24 Thread Matt Foley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated HDFS-1988:
-

Issue Type: Bug  (was: Sub-task)
Parent: (was: HDFS-1852)

 TestTrash infinite loops if run on a home directory with stuff in .Trash
 

 Key: HDFS-1988
 URL: https://issues.apache.org/jira/browse/HDFS-1988
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.22.0
Reporter: Matt Foley
 Fix For: 0.22.0


 Seems to have started failing recently in many commit builds as well as the 
 last two nightly builds of 22:
 https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/51/testReport/org.apache.hadoop.hdfs/TestHDFSTrash/testTrashEmptier/



[jira] [Created] (HDFS-1989) When checkpointing by the backup node occurs in parallel with a file being closed by a client, an exception occurs saying there are no journal streams.

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)
When checkpointing by the backup node occurs in parallel with a file being closed by a client, an exception occurs saying there are no journal streams.


 Key: HDFS-1989
 URL: https://issues.apache.org/jira/browse/HDFS-1989
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.23.0


The backup namenode initiates the checkpointing process.
As part of checkpointing, based on the timestamp, it either downloads the FSImage or uses the existing one.

It then tries to save the FSImage.

During this time it closes the editLog streams.

If, in parallel, a client tries to close a file just after the checkpointing process has closed the editLog streams, we get an exception saying
java.io.IOException: java.lang.IllegalStateException: !!! WARNING !!! File system changes are not persistent. No journal streams.

Here the saveNamespace API closes all the editLog streams, resulting in this issue.
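The interleaving above can be modelled with a hedged sketch. EditLogModel, logEdit, and closeStreams are simplified, illustrative stand-ins, not Hadoop's FSEditLog API: a client edit that arrives after the checkpoint path has closed every journal stream hits the same IllegalStateException quoted in the report.

```java
// Hypothetical, simplified model of the race described above. EditLogModel,
// logEdit and closeStreams are illustrative names, not Hadoop's FSEditLog API.
import java.util.ArrayList;
import java.util.List;

class EditLogModel {
    private final List<String> streams = new ArrayList<>();

    EditLogModel() { streams.add("edits"); }

    // Client path: persist an edit; fails if the checkpoint closed the streams.
    synchronized void logEdit(String op) {
        if (streams.isEmpty()) {
            throw new IllegalStateException(
                "!!! WARNING !!! File system changes are not persistent. No journal streams.");
        }
        // ... write op to every open stream ...
    }

    // Checkpoint path: saveNamespace closes every journal stream.
    synchronized void closeStreams() { streams.clear(); }

    synchronized boolean hasStreams() { return !streams.isEmpty(); }
}

public class JournalRace {
    public static void main(String[] args) {
        EditLogModel log = new EditLogModel();
        log.logEdit("OP_CLOSE /file1");      // fine while streams are open
        log.closeStreams();                  // checkpoint closes the journals
        try {
            log.logEdit("OP_CLOSE /file2");  // a client close now fails
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```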



 




[jira] [Commented] (HDFS-1989) When checkpointing by the backup node occurs in parallel with a file being closed by a client, an exception occurs saying there are no journal streams.

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038518#comment-13038518
 ] 

ramkrishna.s.vasudevan commented on HDFS-1989:
--

Why does the saveNamespace API close all the editLog streams?

Ideally, once a checkpoint starts, the streams have already been diverted to 
edits.new, and in saveCheckpoint we try to move the current directory 
contents to lastcheckpoint.tmp.

So in what scenario do we require the editLog.close() that results in closing 
all edit streams?

 When checkpointing by the backup node occurs in parallel with a file being closed by a client, an exception occurs saying there are no journal streams.
 

 Key: HDFS-1989
 URL: https://issues.apache.org/jira/browse/HDFS-1989
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.23.0


 The backup namenode initiates the checkpointing process.
 As part of checkpointing, based on the timestamp, it either downloads the 
 FSImage or uses the existing one.
 It then tries to save the FSImage.
 During this time it closes the editLog streams.
 If, in parallel, a client tries to close a file just after the checkpointing 
 process has closed the editLog streams, we get an exception saying
 java.io.IOException: java.lang.IllegalStateException: !!! WARNING !!! File 
 system changes are not persistent. No journal streams.
 Here the saveNamespace API closes all the editLog streams, resulting in this 
 issue.



[jira] [Commented] (HDFS-1989) When checkpointing by the backup node occurs in parallel with a file being closed by a client, an exception occurs saying there are no journal streams.

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038519#comment-13038519
 ] 

ramkrishna.s.vasudevan commented on HDFS-1989:
--

of syncs: 21 SyncTimes(ms): 130 176 
2011-05-17 14:28:44,921 ERROR org.apache.hadoop.hdfs.server.namenode.FSEditLog: Unable to sync edit log.
java.io.IOException: java.lang.IllegalStateException: !!! WARNING !!! File system changes are not persistent. No journal streams.
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logEdit(FSEditLog.java:1029)
	at org.apache.hadoop.hdfs.server.namenode.BackupImage.journal(BackupImage.java:247)
	at org.apache.hadoop.hdfs.server.namenode.BackupNode.journal(BackupNode.java:224)
	at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:422)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1419)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1415)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1131)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1413)

	at org.apache.hadoop.ipc.Client.call(Client.java:1052)
	at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:250)
	at $Proxy8.journal(Unknown Source)
	at org.apache.hadoop.hdfs.server.namenode.EditLogBackupOutputStream.send(EditLogBackupOutputStream.java:181)
	at org.apache.hadoop.hdfs.server.namenode.EditLogBackupOutputStream.flushAndSync(EditLogBackupOutputStream.java:155)
	at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:84)
	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:515)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:1797)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.complete(NameNode.java:896)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:422)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1419)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1415)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1131)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1413)


 When checkpointing by the backup node occurs in parallel with a file being closed by a client, an exception occurs saying there are no journal streams.
 

 Key: HDFS-1989
 URL: https://issues.apache.org/jira/browse/HDFS-1989
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.23.0


 The backup namenode initiates the checkpointing process.
 As part of checkpointing, based on the timestamp, it either downloads the 
 FSImage or uses the existing one.
 It then tries to save the FSImage.
 During this time it closes the editLog streams.
 If, in parallel, a client tries to close a file just after the checkpointing 
 process has closed the editLog streams, we get an exception saying
 java.io.IOException: java.lang.IllegalStateException: !!! WARNING !!! File 
 system changes are not persistent. No journal streams.
 Here the saveNamespace API closes all the editLog streams, resulting in this 
 issue.



[jira] [Created] (HDFS-1990) Resource leaks in HDFS

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)
Resource leaks in HDFS
--

 Key: HDFS-1990
 URL: https://issues.apache.org/jira/browse/HDFS-1990
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, name-node
Affects Versions: 0.23.0
Reporter: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.23.0


Possible resource leakage in HDFS.



[jira] [Assigned] (HDFS-1727) fsck command can display command usage if user passes any illegal argument

2011-05-24 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HDFS-1727:
-

Assignee: (was: Uma Maheswara Rao G)

 fsck command can display command usage if user passes any illegal argument
 --

 Key: HDFS-1727
 URL: https://issues.apache.org/jira/browse/HDFS-1727
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Uma Maheswara Rao G
Priority: Minor

 In the fsck command, if the user passes arguments like
 ./hadoop fsck -test -files -blocks -racks
 it will take / and display information for the whole DFS regarding files, 
 blocks, and racks.
 But here we are hiding the user's mistake. Instead, we could display the 
 command usage when the user passes an invalid argument like the above.
 Likewise, if the user passes an illegal optional argument like
 ./hadoop fsck /test -listcorruptfileblocks instead of
 ./hadoop fsck /test -list-corruptfileblocks, we could display the proper 
 command usage.
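A minimal sketch of the suggested behaviour, under stated assumptions: FsckArgs is a hypothetical helper, not Hadoop's real DFSck parser, and the flag list follows the report. Unknown flags yield the usage string instead of a silent scan of /.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: FsckArgs is illustrative, not Hadoop's DFSck class.
// The flag names follow the report.
public class FsckArgs {
    static final List<String> KNOWN =
        Arrays.asList("-files", "-blocks", "-racks", "-list-corruptfileblocks");
    static final String USAGE =
        "Usage: fsck <path> [-files [-blocks [-racks]]] [-list-corruptfileblocks]";

    // Returns the usage string on any illegal argument, or null if args are valid.
    static String validate(String[] args) {
        if (args.length == 0 || args[0].startsWith("-")) {
            return USAGE;                 // the first argument must be a path
        }
        for (int i = 1; i < args.length; i++) {
            if (!KNOWN.contains(args[i])) {
                return USAGE;             // e.g. the -listcorruptfileblocks typo
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Missing path: usage instead of silently scanning "/".
        System.out.println(validate(new String[] {"-test", "-files"}) != null);
        // Misspelled optional flag: usage again.
        System.out.println(validate(new String[] {"/test", "-listcorruptfileblocks"}) != null);
        // Well-formed invocation passes.
        System.out.println(validate(new String[] {"/test", "-files", "-blocks"}) == null);
    }
}
```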



[jira] [Updated] (HDFS-1951) Null pointer exception occurs when Namenode recovery happens, there is no response from the client to the NN for more than the hard limit for NN recovery, and the current block is larger than the previous block size in the NN

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HDFS-1951:
-

Attachment: HDFS-1951.patch

 Null pointer exception occurs when Namenode recovery happens, there is no 
 response from the client to the NN for more than the hard limit for NN 
 recovery, and the current block is larger than the previous block size in the NN
 

 Key: HDFS-1951
 URL: https://issues.apache.org/jira/browse/HDFS-1951
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20-append
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.20-append

 Attachments: HDFS-1951.patch


 Null pointer exception occurs when Namenode recovery happens, there is no 
 response from the client to the NN for more than the hard limit for NN 
 recovery, and the current block is larger than the previous block size in the NN.
 1. Write from a client to 2 datanodes.
 2. Kill one datanode and allow pipeline recovery.
 3. Write some more data to the same block.
 4. In parallel, allow the namenode recovery to happen.
 A null pointer exception will occur in the addStoreBlock API.



[jira] [Updated] (HDFS-1949) NumberFormatException is displayed in the Namenode UI when the chunk size field is blank or a string value.

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HDFS-1949:
-

Attachment: hdfs-1949-1.patch

 NumberFormatException is displayed in the Namenode UI when the chunk size 
 field is blank or a string value.
 -

 Key: HDFS-1949
 URL: https://issues.apache.org/jira/browse/HDFS-1949
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20-append, 0.21.0, 0.23.0
Reporter: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1949.patch, hdfs-1949-1.patch, hdfs-1949.patch


 In the Namenode UI we have a text box to enter the chunk size.
 The expected value for the chunk size is a valid integer.
 If an invalid value, a string, or empty spaces are provided, it throws a 
 NumberFormatException.
 The desired behaviour is to fall back to the default value when no valid 
 value is specified.
 Solution:
 We can handle the NumberFormatException and assign the default value when an 
 invalid value is specified.
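The proposed handling can be sketched as follows; DEFAULT_CHUNK_SIZE and parseChunkSize are illustrative names and an assumed default, not taken from the NameNode JSP:

```java
// Hedged sketch of the proposed fix; DEFAULT_CHUNK_SIZE and parseChunkSize
// are illustrative names, not taken from the NameNode JSP.
public class ChunkSize {
    static final int DEFAULT_CHUNK_SIZE = 32768; // assumed default value

    // Fall back to the default when the field is blank or non-numeric.
    static int parseChunkSize(String raw) {
        if (raw == null) return DEFAULT_CHUNK_SIZE;
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_CHUNK_SIZE;   // blank, spaces, or a string value
        }
    }

    public static void main(String[] args) {
        System.out.println(parseChunkSize("4096"));  // 4096
        System.out.println(parseChunkSize("   "));   // default, no exception
        System.out.println(parseChunkSize("abc"));   // default, no exception
    }
}
```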



[jira] [Updated] (HDFS-1727) fsck command can display command usage if user passes any illegal argument

2011-05-24 Thread sravankorumilli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sravankorumilli updated HDFS-1727:
--

Attachment: HDFS-1727.patch

 fsck command can display command usage if user passes any illegal argument
 --

 Key: HDFS-1727
 URL: https://issues.apache.org/jira/browse/HDFS-1727
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Uma Maheswara Rao G
Priority: Minor
 Attachments: HDFS-1727.patch


 In the fsck command, if the user passes arguments like
 ./hadoop fsck -test -files -blocks -racks
 it will take / and display information for the whole DFS regarding files, 
 blocks, and racks.
 But here we are hiding the user's mistake. Instead, we could display the 
 command usage when the user passes an invalid argument like the above.
 Likewise, if the user passes an illegal optional argument like
 ./hadoop fsck /test -listcorruptfileblocks instead of
 ./hadoop fsck /test -list-corruptfileblocks, we could display the proper 
 command usage.



[jira] [Updated] (HDFS-1805) Some tests in TestDFSShell do not shut down the MiniDFSCluster on an exception/assertion failure. This leads other test cases to fail.

2011-05-24 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-1805:
--

Attachment: HDFS-1805-1.patch

 Some tests in TestDFSShell do not shut down the MiniDFSCluster on an 
 exception/assertion failure. This leads other test cases to fail.
 ---

 Key: HDFS-1805
 URL: https://issues.apache.org/jira/browse/HDFS-1805
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Minor
 Attachments: HDFS-1805-1.patch, HDFS-1805.patch


 Some test cases in TestDFSShell do not shut down the MiniDFSCluster in a 
 finally block.
 Any assertion failure or exception can therefore leave the cluster running, 
 which makes other test cases fail and makes it difficult to find the actual 
 test-case failures.
 So it is better to shut down the cluster in finally.
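The finally-based fix can be sketched like this; StubCluster is a stand-in for MiniDFSCluster so the example stays self-contained:

```java
// Illustrative pattern for the fix; StubCluster stands in for MiniDFSCluster
// so the sketch stays self-contained.
class StubCluster {
    boolean running = true;
    void shutdown() { running = false; }
}

public class ShutdownPattern {
    static StubCluster lastCluster;

    static void testSomething(boolean fail) {
        StubCluster cluster = new StubCluster();
        lastCluster = cluster;
        try {
            if (fail) throw new AssertionError("test assertion failed");
            // ... test body that uses the cluster ...
        } finally {
            cluster.shutdown();   // runs even when the assertion above fails
        }
    }

    public static void main(String[] args) {
        try {
            testSomething(true);
        } catch (AssertionError expected) {
            // the failure still propagates to the test runner
        }
        System.out.println("cluster running after failure: " + lastCluster.running);
    }
}
```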



[jira] [Updated] (HDFS-1949) NumberFormatException is displayed in the Namenode UI when the chunk size field is blank or a string value.

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HDFS-1949:
-

Status: Open  (was: Patch Available)

 NumberFormatException is displayed in the Namenode UI when the chunk size 
 field is blank or a string value.
 -

 Key: HDFS-1949
 URL: https://issues.apache.org/jira/browse/HDFS-1949
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.21.0, 0.20-append, 0.23.0
Reporter: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1949.patch, hdfs-1949-1.patch, hdfs-1949.patch


 In the Namenode UI we have a text box to enter the chunk size.
 The expected value for the chunk size is a valid integer.
 If an invalid value, a string, or empty spaces are provided, it throws a 
 NumberFormatException.
 The desired behaviour is to fall back to the default value when no valid 
 value is specified.
 Solution:
 We can handle the NumberFormatException and assign the default value when an 
 invalid value is specified.



[jira] [Updated] (HDFS-1949) NumberFormatException is displayed in the Namenode UI when the chunk size field is blank or a string value.

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HDFS-1949:
-

Status: Patch Available  (was: Open)

 NumberFormatException is displayed in the Namenode UI when the chunk size 
 field is blank or a string value.
 -

 Key: HDFS-1949
 URL: https://issues.apache.org/jira/browse/HDFS-1949
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.21.0, 0.20-append, 0.23.0
Reporter: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1949.patch, hdfs-1949-1.patch, hdfs-1949.patch


 In the Namenode UI we have a text box to enter the chunk size.
 The expected value for the chunk size is a valid integer.
 If an invalid value, a string, or empty spaces are provided, it throws a 
 NumberFormatException.
 The desired behaviour is to fall back to the default value when no valid 
 value is specified.
 Solution:
 We can handle the NumberFormatException and assign the default value when an 
 invalid value is specified.



[jira] [Updated] (HDFS-1805) Some tests in TestDFSShell do not shut down the MiniDFSCluster on an exception/assertion failure. This leads other test cases to fail.

2011-05-24 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-1805:
--

Fix Version/s: 0.23.0
Affects Version/s: 0.23.0
   Status: Open  (was: Patch Available)

 Some tests in TestDFSShell do not shut down the MiniDFSCluster on an 
 exception/assertion failure. This leads other test cases to fail.
 ---

 Key: HDFS-1805
 URL: https://issues.apache.org/jira/browse/HDFS-1805
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1805-1.patch, HDFS-1805.patch


 Some test cases in TestDFSShell do not shut down the MiniDFSCluster in a 
 finally block.
 Any assertion failure or exception can therefore leave the cluster running, 
 which makes other test cases fail and makes it difficult to find the actual 
 test-case failures.
 So it is better to shut down the cluster in finally.



[jira] [Updated] (HDFS-1805) Some tests in TestDFSShell do not shut down the MiniDFSCluster on an exception/assertion failure. This leads other test cases to fail.

2011-05-24 Thread Uma Maheswara Rao G (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-1805:
--

Status: Patch Available  (was: Open)

 Some tests in TestDFSShell do not shut down the MiniDFSCluster on an 
 exception/assertion failure. This leads other test cases to fail.
 ---

 Key: HDFS-1805
 URL: https://issues.apache.org/jira/browse/HDFS-1805
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1805-1.patch, HDFS-1805.patch


 Some test cases in TestDFSShell do not shut down the MiniDFSCluster in a 
 finally block.
 Any assertion failure or exception can therefore leave the cluster running, 
 which makes other test cases fail and makes it difficult to find the actual 
 test-case failures.
 So it is better to shut down the cluster in finally.



[jira] [Updated] (HDFS-1951) Null pointer exception occurs when Namenode recovery happens, there is no response from the client to the NN for more than the hard limit for NN recovery, and the current block is larger than the previous block size in the NN

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HDFS-1951:
-

Status: Patch Available  (was: Open)

 Null pointer exception occurs when Namenode recovery happens, there is no 
 response from the client to the NN for more than the hard limit for NN 
 recovery, and the current block is larger than the previous block size in the NN
 

 Key: HDFS-1951
 URL: https://issues.apache.org/jira/browse/HDFS-1951
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20-append
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.20-append

 Attachments: HDFS-1951.patch


 Null pointer exception occurs when Namenode recovery happens, there is no 
 response from the client to the NN for more than the hard limit for NN 
 recovery, and the current block is larger than the previous block size in the NN.
 1. Write from a client to 2 datanodes.
 2. Kill one datanode and allow pipeline recovery.
 3. Write some more data to the same block.
 4. In parallel, allow the namenode recovery to happen.
 A null pointer exception will occur in the addStoreBlock API.



[jira] [Updated] (HDFS-1727) fsck command can display command usage if user passes any illegal argument

2011-05-24 Thread sravankorumilli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sravankorumilli updated HDFS-1727:
--

Affects Version/s: 0.23.0
   0.20.1
   Status: Patch Available  (was: Open)

 fsck command can display command usage if user passes any illegal argument
 --

 Key: HDFS-1727
 URL: https://issues.apache.org/jira/browse/HDFS-1727
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.1, 0.23.0
Reporter: Uma Maheswara Rao G
Priority: Minor
 Attachments: HDFS-1727.patch


 In the fsck command, if the user passes arguments like
 ./hadoop fsck -test -files -blocks -racks
 it will take / and display information for the whole DFS regarding files, 
 blocks, and racks.
 But here we are hiding the user's mistake. Instead, we could display the 
 command usage when the user passes an invalid argument like the above.
 Likewise, if the user passes an illegal optional argument like
 ./hadoop fsck /test -listcorruptfileblocks instead of
 ./hadoop fsck /test -list-corruptfileblocks, we could display the proper 
 command usage.



[jira] [Commented] (HDFS-1951) Null pointer exception occurs when Namenode recovery happens, there is no response from the client to the NN for more than the hard limit for NN recovery, and the current block is larger than the previous block size in the NN

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038578#comment-13038578
 ] 

ramkrishna.s.vasudevan commented on HDFS-1951:
--

Applicable in 0.20-append branch.

 Null pointer exception occurs when Namenode recovery happens, there is no 
 response from the client to the NN for more than the hard limit for NN 
 recovery, and the current block is larger than the previous block size in the NN
 

 Key: HDFS-1951
 URL: https://issues.apache.org/jira/browse/HDFS-1951
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20-append
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.20-append

 Attachments: HDFS-1951.patch


 Null pointer exception occurs when Namenode recovery happens, there is no 
 response from the client to the NN for more than the hard limit for NN 
 recovery, and the current block is larger than the previous block size in the NN.
 1. Write from a client to 2 datanodes.
 2. Kill one datanode and allow pipeline recovery.
 3. Write some more data to the same block.
 4. In parallel, allow the namenode recovery to happen.
 A null pointer exception will occur in the addStoreBlock API.



[jira] [Commented] (HDFS-1805) Some tests in TestDFSShell do not shut down the MiniDFSCluster on an exception/assertion failure. This leads other test cases to fail.

2011-05-24 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038579#comment-13038579
 ] 

Daryn Sharp commented on HDFS-1805:
---

Is it possible to start/stop the mini cluster in @BeforeClass/@AfterClass, and 
then delete the fs contents in @Before?  This should greatly reduce the runtime 
of the tests.
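This suggestion can be sketched as a class-level lifecycle. The beforeClass/before/afterClass methods stand in for JUnit's @BeforeClass/@Before/@AfterClass annotations, and the Set<String> stands in for the DFS contents; neither is a Hadoop or JUnit type.

```java
// Sketch of the shared-cluster lifecycle: beforeClass/before/afterClass stand
// in for JUnit's @BeforeClass/@Before/@AfterClass, and the Set<String> stands
// in for the cluster's file system contents.
import java.util.HashSet;
import java.util.Set;

public class SharedClusterLifecycle {
    static int clusterStarts = 0;             // counts expensive cluster startups
    static Set<String> fs = new HashSet<>();  // stand-in for the cluster's file system

    static void beforeClass() { clusterStarts++; }           // start cluster once
    static void before()      { fs.clear(); }                // wipe fs per test
    static void afterClass()  { /* cluster.shutdown() */ }   // stop cluster once

    static void test1() { fs.add("/a"); }
    static void test2() { /* sees an empty fs, not test1's /a */ }

    public static void main(String[] args) {
        beforeClass();
        before(); test1();
        before(); test2();
        afterClass();
        // One startup serves both tests, yet each test saw a clean file system.
        System.out.println("starts=" + clusterStarts + ", fsEmpty=" + fs.isEmpty());
    }
}
```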

 Some tests in TestDFSShell do not shut down the MiniDFSCluster on an 
 exception/assertion failure. This leads other test cases to fail.
 ---

 Key: HDFS-1805
 URL: https://issues.apache.org/jira/browse/HDFS-1805
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1805-1.patch, HDFS-1805.patch


 Some test cases in TestDFSShell do not shut down the MiniDFSCluster in a 
 finally block.
 Any assertion failure or exception can therefore leave the cluster running, 
 which makes other test cases fail and makes it difficult to find the actual 
 test-case failures.
 So it is better to shut down the cluster in finally.



[jira] [Commented] (HDFS-1869) mkdirs should use the supplied permission for all of the created directories

2011-05-24 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038585#comment-13038585
 ] 

Daryn Sharp commented on HDFS-1869:
---

A general question: do we favor the fs being POSIX compliant, or maintaining 
idiosyncrasies?

 mkdirs should use the supplied permission for all of the created directories
 

 Key: HDFS-1869
 URL: https://issues.apache.org/jira/browse/HDFS-1869
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: HDFS-1869-2.patch, HDFS-1869.patch


 Mkdirs only uses the supplied FsPermission for the last directory of the 
 path.  Paths 0..N-1 will all inherit the parent dir's permissions -even if- 
 inheritPermission is false.  This is a regression from somewhere around 
 0.20.9 and does not follow posix semantics.
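On a POSIX file system the requested semantics can be sketched with java.nio as a stand-in for FileSystem.mkdirs; mkdirsWithPermission is a hypothetical helper, not Hadoop's API, and it applies the supplied permission to every level it creates, not just the last one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// POSIX-only sketch; mkdirsWithPermission is a hypothetical helper, not
// Hadoop's FileSystem.mkdirs. It applies the supplied permission to every
// directory level it creates, not just the last one.
public class MkdirsAll {
    static void mkdirsWithPermission(Path target, Set<PosixFilePermission> perm)
            throws IOException {
        FileAttribute<Set<PosixFilePermission>> attr =
            PosixFilePermissions.asFileAttribute(perm);
        Path p = target.isAbsolute() ? target.getRoot() : null;
        for (Path part : target) {          // walk the path element by element
            p = (p == null) ? part : p.resolve(part);
            if (!Files.exists(p)) {
                Files.createDirectory(p, attr);  // same mode for each new level
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("mkdirs-demo");
        mkdirsWithPermission(base.resolve("a/b/c"),
            PosixFilePermissions.fromString("rwx------"));
        // Intermediate "a" got the supplied mode (under a typical umask),
        // not the parent directory's permissions.
        System.out.println(
            PosixFilePermissions.toString(
                Files.getPosixFilePermissions(base.resolve("a"))));
    }
}
```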



[jira] [Commented] (HDFS-1951) Null pointer exception occurs when Namenode recovery happens, there is no response from the client to the NN for more than the hard limit for NN recovery, and the current block is larger than the previous block size in the NN

2011-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038592#comment-13038592
 ] 

Hadoop QA commented on HDFS-1951:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480247/HDFS-1951.patch
  against trunk revision 1126795.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/614//console

This message is automatically generated.

 Null pointer exception occurs when Namenode recovery happens, there is no 
 response from the client to the NN for more than the hard limit for NN 
 recovery, and the current block is larger than the previous block size in the NN
 

 Key: HDFS-1951
 URL: https://issues.apache.org/jira/browse/HDFS-1951
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20-append
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.20-append

 Attachments: HDFS-1951.patch


 Null pointer exception occurs when Namenode recovery happens, there is no 
 response from the client to the NN for more than the hard limit for NN 
 recovery, and the current block is larger than the previous block size in the NN.
 1. Write from a client to 2 datanodes.
 2. Kill one datanode and allow pipeline recovery.
 3. Write some more data to the same block.
 4. In parallel, allow the namenode recovery to happen.
 A null pointer exception will occur in the addStoreBlock API.



[jira] [Updated] (HDFS-1620) FSConstants vs FsConstants

2011-05-24 Thread Harsh J Chouraria (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J Chouraria updated HDFS-1620:


Attachment: HDFS-1620.r1.diff

Patch that refactors FSConstants to DFSConstants, and deprecates the original 
with a doc link to the newer interface.

 FSConstants vs FsConstants 
 ---

 Key: HDFS-1620
 URL: https://issues.apache.org/jira/browse/HDFS-1620
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Harsh J Chouraria
Priority: Minor
 Attachments: HDFS-1620.r1.diff


 We have {{org.apache.hadoop.hdfs.protocol.*FSConstants*}} and 
 {{org.apache.hadoop.fs.*FsConstants*}}.  Elegant or confused?



[jira] [Updated] (HDFS-1620) FSConstants vs FsConstants

2011-05-24 Thread Harsh J Chouraria (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J Chouraria updated HDFS-1620:


 Tags: fsconstants, dfsconstants, name changes, refactor, rename
Fix Version/s: 0.23.0
Affects Version/s: 0.22.0
 Release Note: Rename HDFS's FSConstants interface into DFSConstants.
   Status: Patch Available  (was: Open)

 FSConstants vs FsConstants 
 ---

 Key: HDFS-1620
 URL: https://issues.apache.org/jira/browse/HDFS-1620
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.22.0
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Harsh J Chouraria
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1620.r1.diff


 We have {{org.apache.hadoop.hdfs.protocol.*FSConstants*}} and 
 {{org.apache.hadoop.fs.*FsConstants*}}.  Elegant or confused?



[jira] [Commented] (HDFS-1727) fsck command can display command usage if user passes any illegal argument

2011-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038634#comment-13038634
 ] 

Hadoop QA commented on HDFS-1727:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480252/HDFS-1727.patch
  against trunk revision 1126795.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFSOutputSummer

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/617//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/617//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/617//console

This message is automatically generated.

 fsck command can display command usage if user passes any illegal argument
 --

 Key: HDFS-1727
 URL: https://issues.apache.org/jira/browse/HDFS-1727
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.1, 0.23.0
Reporter: Uma Maheswara Rao G
Priority: Minor
 Attachments: HDFS-1727.patch


 In the fsck command, if the user passes arguments like
 ./hadoop fsck -test -files -blocks -racks
 it takes / as the path and displays information for the whole DFS regarding 
 files, blocks, and racks.
 But this hides the user's mistake. Instead, we can display the command usage 
 when the user passes an invalid argument like the above.
 Similarly, if the user passes an illegal optional argument such as
 ./hadoop fsck /test -listcorruptfileblocks instead of
 ./hadoop fsck /test -list-corruptfileblocks, we can also display the proper 
 command usage.
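A sketch of the validation being proposed (the flag set and helper are illustrative, not HDFS's actual fsck parser): reject any unrecognized option instead of silently falling back to the root path.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative argument check for the behaviour proposed above: an
// unrecognized option should trigger the usage message instead of being
// silently ignored. Flag names are examples, not fsck's full option set.
public class FsckArgCheck {
    static final Set<String> KNOWN_FLAGS = new HashSet<>(Arrays.asList(
        "-files", "-blocks", "-racks", "-locations",
        "-openforwrite", "-list-corruptfileblocks"));

    /** Returns the target path, or null when usage should be printed. */
    static String parse(String[] args) {
        String path = null;
        for (String a : args) {
            if (a.startsWith("-")) {
                if (!KNOWN_FLAGS.contains(a)) {
                    return null;          // unknown flag: show usage
                }
            } else if (path == null) {
                path = a;                 // first non-flag token is the path
            }
        }
        return path;                      // null when no path was given
    }
}
```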



[jira] [Commented] (HDFS-1805) Some tests in TestDFSShell cannot shut down the MiniDFSCluster on an exception/assertion failure, which leads to failures in other testcases.

2011-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038640#comment-13038640
 ] 

Hadoop QA commented on HDFS-1805:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480251/HDFS-1805-1.patch
  against trunk revision 1126795.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 55 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSShell
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/616//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/616//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/616//console


 Some tests in TestDFSShell cannot shut down the MiniDFSCluster on an 
 exception/assertion failure, which leads to failures in other testcases.
 ---

 Key: HDFS-1805
 URL: https://issues.apache.org/jira/browse/HDFS-1805
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.0
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1805-1.patch, HDFS-1805.patch


 Some test cases in TestDFSShell are not shutting down the MiniDFSCluster in 
 finally.
 Any assertion failure or exception in a test can result in the cluster not 
 being shut down. Because of this, other testcases will fail, which makes it 
 difficult to find the actual testcase failures.
 So it is better to shut down the cluster in finally. 
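The fix boils down to the standard try/finally cleanup pattern; a minimal sketch with a stand-in interface (MiniDFSCluster itself is not modeled here):

```java
// Sketch of the cleanup pattern the patch applies: shut the cluster down in
// finally so a failed assertion in one test cannot leave it running and
// poison the tests that follow. Cluster is a stand-in for MiniDFSCluster.
public class ShutdownInFinally {
    interface Cluster { void shutdown(); }

    static void runTest(Cluster cluster, Runnable testBody) {
        try {
            testBody.run();        // may throw an assertion or exception
        } finally {
            cluster.shutdown();    // always executed, success or failure
        }
    }
}
```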



[jira] [Commented] (HDFS-1620) FSConstants vs FsConstants

2011-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038660#comment-13038660
 ] 

Hadoop QA commented on HDFS-1620:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480271/HDFS-1620.r1.diff
  against trunk revision 1126795.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 131 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/618//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/618//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/618//console


 FSConstants vs FsConstants 
 ---

 Key: HDFS-1620
 URL: https://issues.apache.org/jira/browse/HDFS-1620
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.22.0
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Harsh J Chouraria
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1620.r1.diff


 We have {{org.apache.hadoop.hdfs.protocol.*FSConstants*}} and 
 {{org.apache.hadoop.fs.*FsConstants*}}.  Elegant or confused?



[jira] [Commented] (HDFS-1990) Resource leaks in HDFS

2011-05-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038664#comment-13038664
 ] 

Aaron T. Myers commented on HDFS-1990:
--

Hi Ramkrishna, can you provide any more specific details about exactly what 
resources appear to be leaking under what circumstances? Thanks a lot.

 Resource leaks in HDFS
 --

 Key: HDFS-1990
 URL: https://issues.apache.org/jira/browse/HDFS-1990
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, name-node
Affects Versions: 0.23.0
Reporter: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.23.0


 Possible resource leakage in HDFS.



[jira] [Updated] (HDFS-1968) Enhance TestWriteRead to support File Append and Position Read

2011-05-24 Thread CW Chung (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

CW Chung updated HDFS-1968:
---

Attachment: TestWriteRead.patch

The patch supports the following additional command line options:

[-append | -truncate]: if the file already exists, truncate it to 0 length 
before writing to it. The default is to append to the file.

[-usePosRead | -useSeqRead]: use position read to read the file, or the 
default sequential read.

Note: Sorry for the large diff. The Eclipse formatter made a number of 
cosmetic changes to the spacing. 
 

 Enhance TestWriteRead to support File Append and Position Read 
 ---

 Key: HDFS-1968
 URL: https://issues.apache.org/jira/browse/HDFS-1968
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 0.23.0
Reporter: CW Chung
Assignee: CW Chung
Priority: Minor
 Attachments: TestWriteRead.patch, TestWriteRead.patch


 Desirable to enhance TestWriteRead to support command line options to do: 
 (1) File Append  
 (2) Position Read (currently supporting sequential read).   



[jira] [Updated] (HDFS-1968) Enhance TestWriteRead to support File Append and Position Read

2011-05-24 Thread CW Chung (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

CW Chung updated HDFS-1968:
---

Attachment: TestWriteRead.patch

This time with the proper granting license to ASF.

 Enhance TestWriteRead to support File Append and Position Read 
 ---

 Key: HDFS-1968
 URL: https://issues.apache.org/jira/browse/HDFS-1968
 Project: Hadoop HDFS
  Issue Type: Test
  Components: test
Affects Versions: 0.23.0
Reporter: CW Chung
Assignee: CW Chung
Priority: Minor
 Attachments: TestWriteRead.patch, TestWriteRead.patch


 Desirable to enhance TestWriteRead to support command line options to do: 
 (1) File Append  
 (2) Position Read (currently supporting sequential read).   



[jira] [Commented] (HDFS-1949) NumberFormatException is displayed in the NameNode UI when the chunk size field is blank or a string value

2011-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038727#comment-13038727
 ] 

Hadoop QA commented on HDFS-1949:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480246/hdfs-1949-1.patch
  against trunk revision 1126795.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestFileAppend4
  org.apache.hadoop.hdfs.TestLargeBlock
  org.apache.hadoop.hdfs.TestWriteConfigurationToDFS

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/615//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/615//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/615//console


 NumberFormatException is displayed in the NameNode UI when the chunk size 
 field is blank or a string value 
 -

 Key: HDFS-1949
 URL: https://issues.apache.org/jira/browse/HDFS-1949
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20-append, 0.21.0, 0.23.0
Reporter: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1949.patch, hdfs-1949-1.patch, hdfs-1949.patch


 In the NameNode UI we have a text box to enter the chunk size.
 The expected value for the chunk size is a valid integer.
 If an invalid value, a string, or empty spaces are provided, it throws a 
 NumberFormatException.
 The expected behaviour is to use the default value if no valid value is 
 specified.
 Soln
 
 We can handle the NumberFormatException and assign the default value if an 
 invalid value is specified.
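The proposed handling is a parse-with-default; a minimal sketch (the method name and default value are illustrative, not the actual NameNode UI code):

```java
// Illustrative parse-with-default for the chunk size field: fall back to
// the default when the input is blank or not a valid integer, instead of
// letting NumberFormatException surface in the UI.
public class ChunkSizeParser {
    static final int DEFAULT_CHUNK_SIZE = 32768;   // example default

    static int parseChunkSize(String input) {
        if (input == null || input.trim().isEmpty()) {
            return DEFAULT_CHUNK_SIZE;             // blank field
        }
        try {
            return Integer.parseInt(input.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_CHUNK_SIZE;             // e.g. "abc"
        }
    }
}
```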



[jira] [Commented] (HDFS-1969) Running rollback on new-version namenode destroys namespace

2011-05-24 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038765#comment-13038765
 ] 

Konstantin Shvachko commented on HDFS-1969:
---

I think originally the tests were written under the assumption that the VERSION 
file is the same for all versions. It is not anymore, so I agree with you that 
we should have special methods for generating the version file, somewhere in 
UpgradeUtilities.

I am not against Guava or the vote; I just think it is strange to introduce a 
new package dependency for just one line, which to me looks like a classic 
assert.
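For context, the trade-off under discussion can be sketched like this (checkState below is a local stand-in for Guava's Preconditions.checkState, not the Guava code itself): a plain assert is a no-op unless the JVM runs with -ea, while a precondition check always fires.

```java
// Why a library checkState differs from a plain assert: assert is silently
// disabled unless the JVM runs with -ea, while a precondition check always
// throws. checkState is a stand-in for Guava's Preconditions.checkState.
public class StateCheck {
    static void checkState(boolean expression, String message) {
        if (!expression) {
            throw new IllegalStateException(message);
        }
    }
}
```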

 Running rollback on new-version namenode destroys namespace
 ---

 Key: HDFS-1969
 URL: https://issues.apache.org/jira/browse/HDFS-1969
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.22.0

 Attachments: hdfs-1969.txt, hdfs-1969.txt


 The following sequence leaves the namespace in an inconsistent/broken state:
 - format NN using 0.20 (or any prior release, probably)
 - run hdfs namenode -upgrade on 0.22. ^C the NN once it comes up.
 - run hdfs namenode -rollback on 0.22  (this should fail but doesn't!)
 This leaves the name directory in a state such that the version file claims 
 it's an 0.20 namespace, but the fsimage is in 0.22 format. It then crashes 
 when trying to start up.



[jira] [Commented] (HDFS-1969) Running rollback on new-version namenode destroys namespace

2011-05-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038772#comment-13038772
 ] 

Todd Lipcon commented on HDFS-1969:
---

OK, I will add a copy of the routine to save the VERSION file in the test tree.

bq.  I just think it is strange to introduce a new package dependency just for 
one line

The dependency was already introduced a few weeks back in common (so HDFS picks 
it up transitively). I'm just using it in this patch.

 Running rollback on new-version namenode destroys namespace
 ---

 Key: HDFS-1969
 URL: https://issues.apache.org/jira/browse/HDFS-1969
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Blocker
 Fix For: 0.22.0

 Attachments: hdfs-1969.txt, hdfs-1969.txt


 The following sequence leaves the namespace in an inconsistent/broken state:
 - format NN using 0.20 (or any prior release, probably)
 - run hdfs namenode -upgrade on 0.22. ^C the NN once it comes up.
 - run hdfs namenode -rollback on 0.22  (this should fail but doesn't!)
 This leaves the name directory in a state such that the version file claims 
 it's an 0.20 namespace, but the fsimage is in 0.22 format. It then crashes 
 when trying to start up.



[jira] [Created] (HDFS-1991) HDFS-1073: Some refactoring of 2NN to share code more easily with BN and CN

2011-05-24 Thread Todd Lipcon (JIRA)
HDFS-1073: Some refactoring of 2NN to share code more easily with BN and CN
--

 Key: HDFS-1991
 URL: https://issues.apache.org/jira/browse/HDFS-1991
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon






[jira] [Updated] (HDFS-1991) HDFS-1073: Some refactoring of 2NN to share code more easily with BN and CN

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1991:
--

  Component/s: name-node
  Description: 
Currently the Checkpointer class shares a lot of very similar code with the 
secondary namenode.

This JIRA is to do a little cleanup in SecondaryNameNode that will make it 
easier to share code with the BackupNode and CheckpointNode.
Affects Version/s: Edit log branch (HDFS-1073)
Fix Version/s: Edit log branch (HDFS-1073)

 HDFS-1073: Some refactoring of 2NN to share code more easily with BN and CN
 --

 Key: HDFS-1991
 URL: https://issues.apache.org/jira/browse/HDFS-1991
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)


 Currently the Checkpointer class shares a lot of very similar code with the 
 secondary namenode.
 This JIRA is to do a little cleanup in SecondaryNameNode that will make it 
 easier to share code with the BackupNode and CheckpointNode.



[jira] [Commented] (HDFS-1967) TestHDFSTrash failing on trunk and 22

2011-05-24 Thread Matt Foley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038816#comment-13038816
 ] 

Matt Foley commented on HDFS-1967:
--

Todd commented in another thread:

bq. It's still not clear why TestHDFSTrash was putting stuff in the local 
filesystem's .Trash folder. ... TestHDFSTrash is supposed to test HDFS, not 
local.

Completely agree.  If you can help me understand what TestHDFSTrash thinks it's 
doing, and whether it accomplishes it, I'd appreciate it.  I think it's trying 
to run exactly equivalent tests on the local FileSystem with TestTrash and the 
HDFS FileSystem with TestHDFSTrash, but it doesn't seem to be succeeding.

 TestHDFSTrash failing on trunk and 22
 -

 Key: HDFS-1967
 URL: https://issues.apache.org/jira/browse/HDFS-1967
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 0.22.0
Reporter: Todd Lipcon
 Fix For: 0.22.0


 Seems to have started failing recently in many commit builds as well as the 
 last two nightly builds of 22:
 https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/51/testReport/org.apache.hadoop.hdfs/TestHDFSTrash/testTrashEmptier/



[jira] [Created] (HDFS-1992) Remove vestiges of NNStorageListener

2011-05-24 Thread Todd Lipcon (JIRA)
Remove vestiges of NNStorageListener


 Key: HDFS-1992
 URL: https://issues.apache.org/jira/browse/HDFS-1992
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon






[jira] [Commented] (HDFS-1941) Remove -genclusterid from NameNode startup options

2011-05-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038829#comment-13038829
 ] 

Suresh Srinivas commented on HDFS-1941:
---

I reverted this change. This utility to generate cluster ID is useful for 
automated deployments.

 Remove -genclusterid from NameNode startup options
 --

 Key: HDFS-1941
 URL: https://issues.apache.org/jira/browse/HDFS-1941
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1941-1.patch


 Currently, namenode -genclusterid is a helper utility to generate unique 
 clusterid. This option is useless once namenode -format automatically 
 generates the clusterid.



[jira] [Updated] (HDFS-1992) Remove vestiges of NNStorageListener

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1992:
--

  Description: In the HDFS-1073 branch, as of HDFS-1926, the edit log's 
error state no longer needs to be coupled to the FSImage's error state. We left 
in some vestiges of the NNStorageListener interface to keep that JIRA simpler. 
This JIRA is to remove the final vestiges since it's no longer necessary and 
just makes the code harder to follow.
Affects Version/s: Edit log branch (HDFS-1073)
Fix Version/s: Edit log branch (HDFS-1073)

 Remove vestiges of NNStorageListener
 

 Key: HDFS-1992
 URL: https://issues.apache.org/jira/browse/HDFS-1992
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)


 In the HDFS-1073 branch, as of HDFS-1926, the edit log's error state no 
 longer needs to be coupled to the FSImage's error state. We left in some 
 vestiges of the NNStorageListener interface to keep that JIRA simpler. This 
 JIRA is to remove the final vestiges since it's no longer necessary and just 
 makes the code harder to follow.



[jira] [Commented] (HDFS-1941) Remove -genclusterid from NameNode startup options

2011-05-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038833#comment-13038833
 ] 

Todd Lipcon commented on HDFS-1941:
---

Can you reopen and re-resolve it so that the JIRA indicates it was deemed 
invalid?

 Remove -genclusterid from NameNode startup options
 --

 Key: HDFS-1941
 URL: https://issues.apache.org/jira/browse/HDFS-1941
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1941-1.patch


 Currently, namenode -genclusterid is a helper utility to generate unique 
 clusterid. This option is useless once namenode -format automatically 
 generates the clusterid.



[jira] [Updated] (HDFS-1992) Remove vestiges of NNStorageListener

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1992:
--

Attachment: hdfs-1992.txt

 Remove vestiges of NNStorageListener
 

 Key: HDFS-1992
 URL: https://issues.apache.org/jira/browse/HDFS-1992
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1992.txt


 In the HDFS-1073 branch, as of HDFS-1926, the edit log's error state no 
 longer needs to be coupled to the FSImage's error state. We left in some 
 vestiges of the NNStorageListener interface to keep that JIRA simpler. This 
 JIRA is to remove the final vestiges since it's no longer necessary and just 
 makes the code harder to follow.



[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format

2011-05-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038838#comment-13038838
 ] 

Suresh Srinivas commented on HDFS-1905:
---

While looking at our automated deployments, which are failing due to this 
change, I thought of an issue with it.

With this change, for non-federation users, the namenode -format command 
automatically generates the cluster ID. The problem is that for the 
secondary/backup/checkpointer, one must run namenode -format -clusterid 
cid, where cid comes from the first namenode that was formatted. Without 
this, the second namenode would generate its own cluster ID and would not be 
part of the same cluster as the primary.

I vote for reverting this change and moving to namenode -format -clusterid cid.

 Improve the usability of namenode -format 
 --

 Key: HDFS-1905
 URL: https://issues.apache.org/jira/browse/HDFS-1905
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch


 While setting up a 0.23-based cluster, I ran into this issue. When I issue 
 the namenode format command, which changed in 0.23, it should tell the user 
 how to use the command when the complete options are not specified.
 ./hdfs namenode -format
 I get the following error msg, but it is still not clear what the user 
 should do or how to use this command.
 11/05/09 15:36:25 ERROR namenode.NameNode: 
 java.lang.IllegalArgumentException: Format must be provided with clusterid
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)
  
 The usability of this command can be improved.



[jira] [Reopened] (HDFS-1941) Remove -genclusterid from NameNode startup options

2011-05-24 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas reopened HDFS-1941:
---


 Remove -genclusterid from NameNode startup options
 --

 Key: HDFS-1941
 URL: https://issues.apache.org/jira/browse/HDFS-1941
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1941-1.patch


 Currently, namenode -genclusterid is a helper utility to generate unique 
 clusterid. This option is useless once namenode -format automatically 
 generates the clusterid.



[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format

2011-05-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038843#comment-13038843
 ] 

Todd Lipcon commented on HDFS-1905:
---

IMO that is an issue with the secondary/backup/checkpointer -- you shouldn't 
have to format it at all. When it's started with an empty namespace, it should 
simply grab the clusterID/etc from the NN that it's configured to checkpoint 
from. This is how it has worked in the past with namespace ID, for example - 
why should cluster ID be treated differently?

 Improve the usability of namenode -format 
 --

 Key: HDFS-1905
 URL: https://issues.apache.org/jira/browse/HDFS-1905
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch


 While setting up a 0.23-based cluster, I ran into this issue. When I issue 
 the namenode format command, which changed in 0.23, it should tell the user 
 how to use the command when the complete options are not specified.
 ./hdfs namenode -format
 I get the following error msg, but it is still not clear what the user 
 should do or how to use this command.
 11/05/09 15:36:25 ERROR namenode.NameNode: 
 java.lang.IllegalArgumentException: Format must be provided with clusterid
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)
  
 The usability of this command can be improved.



[jira] [Resolved] (HDFS-1941) Remove -genclusterid from NameNode startup options

2011-05-24 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-1941.
---

Resolution: Invalid

 Remove -genclusterid from NameNode startup options
 --

 Key: HDFS-1941
 URL: https://issues.apache.org/jira/browse/HDFS-1941
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1941-1.patch


 Currently, namenode -genclusterid is a helper utility that generates a unique 
 clusterid. This option is unnecessary once namenode -format automatically 
 generates the clusterid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1941) Remove -genclusterid from NameNode startup options

2011-05-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038844#comment-13038844
 ] 

Suresh Srinivas commented on HDFS-1941:
---

Todd, I just commented about reverting the change. Please give me a couple 
more minutes and it will be done.

 Remove -genclusterid from NameNode startup options
 --

 Key: HDFS-1941
 URL: https://issues.apache.org/jira/browse/HDFS-1941
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1941-1.patch


 Currently, namenode -genclusterid is a helper utility that generates a unique 
 clusterid. This option is unnecessary once namenode -format automatically 
 generates the clusterid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1941) Remove -genclusterid from NameNode startup options

2011-05-24 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038845#comment-13038845
 ] 

Hudson commented on HDFS-1941:
--

Integrated in Hadoop-Hdfs-trunk-Commit #681 (See 
[https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/681/])
Reverting the change r1125031 - HDFS-1941. Remove -genclusterid option from 
namenode command.

suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1127311
Files : 
* 
/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
/hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/server/common/HdfsConstants.java
* /hadoop/hdfs/trunk/CHANGES.txt


 Remove -genclusterid from NameNode startup options
 --

 Key: HDFS-1941
 URL: https://issues.apache.org/jira/browse/HDFS-1941
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1941-1.patch


 Currently, namenode -genclusterid is a helper utility that generates a unique 
 clusterid. This option is unnecessary once namenode -format automatically 
 generates the clusterid.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1966) Encapsulate individual DataTransferProtocol op header

2011-05-24 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-1966:
-

Attachment: h1966_20110524.patch

h1966_20110524.patch: reverted some changes.

 Encapsulate individual DataTransferProtocol op header
 -

 Key: HDFS-1966
 URL: https://issues.apache.org/jira/browse/HDFS-1966
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h1966_20110519.patch, h1966_20110524.patch


 It will make a clear distinction between the variables used in the protocol 
 and the others.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated

2011-05-24 Thread Bharath Mundlapudi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharath Mundlapudi updated HDFS-1592:
-

Attachment: HDFS-1592-5.patch

Thanks for the review, Eli and Jitendra. I am attaching a patch which 
incorporates your comments.

 Datanode startup doesn't honor volumes.tolerated 
 -

 Key: HDFS-1592
 URL: https://issues.apache.org/jira/browse/HDFS-1592
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.204.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
 Fix For: 0.20.204.0, 0.23.0

 Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, HDFS-1592-3.patch, 
 HDFS-1592-4.patch, HDFS-1592-5.patch, HDFS-1592-rel20.patch


 Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated

2011-05-24 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1592:
---

Status: Open  (was: Patch Available)

 Datanode startup doesn't honor volumes.tolerated 
 -

 Key: HDFS-1592
 URL: https://issues.apache.org/jira/browse/HDFS-1592
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.204.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
 Fix For: 0.20.204.0, 0.23.0

 Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, HDFS-1592-3.patch, 
 HDFS-1592-4.patch, HDFS-1592-5.patch, HDFS-1592-rel20.patch


 Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated

2011-05-24 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-1592:
---

Status: Patch Available  (was: Open)

 Datanode startup doesn't honor volumes.tolerated 
 -

 Key: HDFS-1592
 URL: https://issues.apache.org/jira/browse/HDFS-1592
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.204.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
 Fix For: 0.20.204.0, 0.23.0

 Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, HDFS-1592-3.patch, 
 HDFS-1592-4.patch, HDFS-1592-5.patch, HDFS-1592-rel20.patch


 Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1984) HDFS-1073: Enable multiple checkpointers to run simultaneously

2011-05-24 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038870#comment-13038870
 ] 

Eli Collins commented on HDFS-1984:
---

+1  Nice tests.  Feel free to address the following in another change.

Can't these two threads in the test race? I imagine they never would in practice.
{noformat}
checkpointThread.start();
// Wait for the first checkpointer to get to where it should save its image.
delayer.waitForCall();
{noformat}

It should be rare that there's no MD5 file for an image (it only happens when 
there's an image from a previous version), so would it make sense to warn in 
places like setVerificationHeaders when an MD5 file is not present? Would it 
make sense to establish the invariant that an MD5 is required?

Not your change, but it would be less error prone if ErrorSimulation used 
e.g. an enum CORRUPT_IMG_XFER instead of 4.
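The enum suggestion could be sketched roughly as below. This is a hypothetical shape, not the actual ErrorSimulator class in HDFS; only the name CORRUPT_IMG_XFER comes from the discussion, and the other fault names and methods are illustrative.

```java
// Hypothetical sketch: named faults replace bare integers like "4" at call
// sites, so a mistyped fault becomes a compile error instead of a silent bug.
import java.util.EnumSet;

public class ErrorSimulator {
    public enum Fault {
        CORRUPT_IMG_XFER,    // the only name taken from the review comment
        TRUNCATE_EDIT_XFER,  // illustrative
        FAIL_CHECKPOINT      // illustrative
    }

    private static final EnumSet<Fault> enabled = EnumSet.noneOf(Fault.class);

    public static void setErrorSimulation(Fault f)    { enabled.add(f); }
    public static void clearErrorSimulation(Fault f)  { enabled.remove(f); }
    public static boolean getErrorSimulation(Fault f) { return enabled.contains(f); }

    public static void main(String[] args) {
        setErrorSimulation(Fault.CORRUPT_IMG_XFER);
        // Reads as getErrorSimulation(Fault.CORRUPT_IMG_XFER) rather than
        // getErrorSimulation(4).
        System.out.println(getErrorSimulation(Fault.CORRUPT_IMG_XFER)); // prints "true"
    }
}
```

EnumSet keeps the same cheap bit-set representation the integer flags had, so there is no runtime cost to the readability gain.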

 HDFS-1073: Enable multiple checkpointers to run simultaneously
 --

 Key: HDFS-1984
 URL: https://issues.apache.org/jira/browse/HDFS-1984
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1984.txt


 One of the motivations of HDFS-1073 is that it decouples the checkpoint 
 process so that multiple checkpoints could be taken at the same time and not 
 interfere with each other.
 Currently on the 1073 branch this doesn't quite work right, since we have 
 some state and validation in FSImage that's tied to a single fsimage_N -- 
 thus if two 2NNs perform a checkpoint at different transaction IDs, only one 
 will succeed.
 As a stress test, we can run two 2NNs each configured with the 
 fs.checkpoint.interval set to 0 which causes them to continuously 
 checkpoint as fast as they can.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1987) HDFS-1073: Test for 2NN downloading image is not running

2011-05-24 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038874#comment-13038874
 ] 

Eli Collins commented on HDFS-1987:
---

+1  lgtm. nice find.

 HDFS-1073: Test for 2NN downloading image is not running
 

 Key: HDFS-1987
 URL: https://issues.apache.org/jira/browse/HDFS-1987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1987.txt


 TestCheckpoint.testSecondaryImageDownload was introduced at some point but 
 was never called from anywhere, so it wasn't actually running. This JIRA is 
 to fix it up to work on trunk and actually run as part of the test suite.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1992) Remove vestiges of NNStorageListener

2011-05-24 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038880#comment-13038880
 ] 

Eli Collins commented on HDFS-1992:
---

+1  lgtm

 Remove vestiges of NNStorageListener
 

 Key: HDFS-1992
 URL: https://issues.apache.org/jira/browse/HDFS-1992
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1992.txt


 In the HDFS-1073 branch, as of HDFS-1926, the edit log's error state no 
 longer needs to be coupled to the FSImage's error state. We left in some 
 vestiges of the NNStorageListener interface to keep that JIRA simpler. This 
 JIRA is to remove the final vestiges since it's no longer necessary and just 
 makes the code harder to follow.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1905) Improve the usability of namenode -format

2011-05-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038882#comment-13038882
 ] 

Suresh Srinivas commented on HDFS-1905:
---

Someone here corrected me: format is not needed for these nodes, so we do not 
have to revert this patch.

 Improve the usability of namenode -format 
 --

 Key: HDFS-1905
 URL: https://issues.apache.org/jira/browse/HDFS-1905
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1905-1.patch, HDFS-1905-2.patch


 While setting up a 0.23-based cluster, I ran into this issue. When I issue 
 the namenode format command, which changed in 23, it should tell the user 
 how to use this command when the complete options are not specified.
 ./hdfs namenode -format
 I get the following error message, but it is still not clear what the user 
 should do or how this command should be used.
 11/05/09 15:36:25 ERROR namenode.NameNode: 
 java.lang.IllegalArgumentException: Format must be provided with clusterid
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1483)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1623)
   at 
 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1689)
  
 The usability of this command can be improved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1592) Datanode startup doesn't honor volumes.tolerated

2011-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038884#comment-13038884
 ] 

Hadoop QA commented on HDFS-1592:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480337/HDFS-1592-5.patch
  against trunk revision 1127317.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/619//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/619//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/619//console

This message is automatically generated.

 Datanode startup doesn't honor volumes.tolerated 
 -

 Key: HDFS-1592
 URL: https://issues.apache.org/jira/browse/HDFS-1592
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.204.0
Reporter: Bharath Mundlapudi
Assignee: Bharath Mundlapudi
 Fix For: 0.20.204.0, 0.23.0

 Attachments: HDFS-1592-1.patch, HDFS-1592-2.patch, HDFS-1592-3.patch, 
 HDFS-1592-4.patch, HDFS-1592-5.patch, HDFS-1592-rel20.patch


 Datanode startup doesn't honor volumes.tolerated for hadoop 20 version.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1967) TestHDFSTrash failing on trunk and 22

2011-05-24 Thread Matt Foley (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038886#comment-13038886
 ] 

Matt Foley commented on HDFS-1967:
--

Suresh suggested that when running TestHDFSTrash on HDFS, the unit test should 
clear the .Trash subdirectory before running.  This will help avoid issues with 
what is effectively a shared Trash directory, although simultaneous execution 
of two builds on the same server may still have problems.  (As pointed out in 
HADOOP-7326, we can't clear the Trash directory when running on Local, as this 
may be your personal Trash folder!)
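The suggested setup step could look roughly like the sketch below. In the real test this would use Hadoop's FileSystem.delete(trashPath, true) against the cluster under test; here a plain java.nio version, with illustrative names, shows the same "clean slate before running" idea on a local directory.

```java
// Hedged sketch: recursively remove the .Trash directory before a trash test
// runs, so leftover entries from earlier runs cannot affect the test.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class TrashCleaner {
    // Deletes trashDir and everything under it; a no-op if it does not exist.
    public static void clearTrash(Path trashDir) throws IOException {
        if (!Files.exists(trashDir)) return;
        try (Stream<Path> walk = Files.walk(trashDir)) {
            // Deepest entries first, so each directory is empty when removed.
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> p.toFile().delete());
        }
    }

    public static void main(String[] args) throws IOException {
        Path trash = Files.createTempDirectory("home").resolve(".Trash");
        Files.createDirectories(trash.resolve("Current/old-data"));
        clearTrash(trash);
        System.out.println(Files.exists(trash)); // prints "false"
    }
}
```

As the HADOOP-7326 caveat notes, this must only run against the test's own filesystem, never a developer's real home directory.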

 TestHDFSTrash failing on trunk and 22
 -

 Key: HDFS-1967
 URL: https://issues.apache.org/jira/browse/HDFS-1967
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 0.22.0
Reporter: Todd Lipcon
 Fix For: 0.22.0


 Seems to have started failing recently in many commit builds as well as the 
 last two nightly builds of 22:
 https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/51/testReport/org.apache.hadoop.hdfs/TestHDFSTrash/testTrashEmptier/

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1967) TestHDFSTrash failing on trunk and 22

2011-05-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038894#comment-13038894
 ] 

Todd Lipcon commented on HDFS-1967:
---

When running on HDFS, it should have its own MiniDFSCluster anyway -- which 
means it would be very surprising to see the .Trash directory exist at all! 
As best I can tell, TestHDFSTrash simply isn't looking at HDFS.

Let's pretend these tests never existed, scrap them, and rewrite them.

 TestHDFSTrash failing on trunk and 22
 -

 Key: HDFS-1967
 URL: https://issues.apache.org/jira/browse/HDFS-1967
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: 0.22.0
Reporter: Todd Lipcon
 Fix For: 0.22.0


 Seems to have started failing recently in many commit builds as well as the 
 last two nightly builds of 22:
 https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/51/testReport/org.apache.hadoop.hdfs/TestHDFSTrash/testTrashEmptier/

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1984) HDFS-1073: Enable multiple checkpointers to run simultaneously

2011-05-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038898#comment-13038898
 ] 

Todd Lipcon commented on HDFS-1984:
---

bq. Can't these two threads in the test race? I imagine they never would in 
practice.

It's OK - delayer.waitForCall will just sit there and wait until the checkpoint 
thread gets to the instrumented method. It works pretty well in the 
TestFileAppend4 tests.
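That handshake can be sketched with a pair of CountDownLatches. The names below are illustrative rather than the actual DelayAnswer test utility; the point is that waitForCall() blocks until the worker thread has actually entered the instrumented method, so the start-then-wait sequence cannot race regardless of scheduling.

```java
// Minimal sketch of a "delayer": the test thread waits until the worker
// reaches an instrumented point, then explicitly lets it continue.
import java.util.concurrent.CountDownLatch;

public class Delayer {
    private final CountDownLatch reached = new CountDownLatch(1);
    private final CountDownLatch mayProceed = new CountDownLatch(1);

    // Called from inside the instrumented method, on the worker thread.
    public void onCall() throws InterruptedException {
        reached.countDown();   // tell the test we got here
        mayProceed.await();    // block until the test lets us continue
    }

    // Called from the test thread; returns only once the worker is in onCall().
    public void waitForCall() throws InterruptedException {
        reached.await();
    }

    public void proceed() {
        mayProceed.countDown();
    }

    public static void main(String[] args) throws Exception {
        Delayer delayer = new Delayer();
        Thread checkpointThread = new Thread(() -> {
            try {
                delayer.onCall(); // stand-in for "save its image"
            } catch (InterruptedException ignored) { }
        });
        checkpointThread.start();
        delayer.waitForCall(); // safe however late the thread is scheduled
        delayer.proceed();
        checkpointThread.join();
        System.out.println("done"); // prints "done"
    }
}
```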

bq. It should be rare that there's no MD5 file for an image (it only happens 
when there's an image from a previous version), so would it make sense to warn 
in places like setVerificationHeaders when an MD5 file is not present?

This same code path is also used for transferring edits, though perhaps we can 
add some flag like requireMd5File. I'll make a note of that as a TODO.

bq. Not your change, but it would be less error prone if ErrorSimulation used 
e.g. an enum CORRUPT_IMG_XFER instead of 4.
Agreed.

 HDFS-1073: Enable multiple checkpointers to run simultaneously
 --

 Key: HDFS-1984
 URL: https://issues.apache.org/jira/browse/HDFS-1984
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1984.txt


 One of the motivations of HDFS-1073 is that it decouples the checkpoint 
 process so that multiple checkpoints could be taken at the same time and not 
 interfere with each other.
 Currently on the 1073 branch this doesn't quite work right, since we have 
 some state and validation in FSImage that's tied to a single fsimage_N -- 
 thus if two 2NNs perform a checkpoint at different transaction IDs, only one 
 will succeed.
 As a stress test, we can run two 2NNs each configured with the 
 fs.checkpoint.interval set to 0 which causes them to continuously 
 checkpoint as fast as they can.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java

2011-05-24 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-1964:
-

Attachment: hdfs-1964-trunk.1.patch

Updated patch, which adds a test that would have caught the errant escaping.

 Incorrect HTML unescaping in DatanodeJspHelper.java
 ---

 Key: HDFS-1964
 URL: https://issues.apache.org/jira/browse/HDFS-1964
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0, 0.23.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 0.22.0, 0.23.0

 Attachments: hdfs-1964-trunk.0.patch, hdfs-1964-trunk.1.patch


 HDFS-1575 introduced some HTML unescaping of parameters so that viewing a 
 file would work for paths containing HTML-escaped characters, but in two of 
 the places did the unescaping either too early or too late.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-1984) HDFS-1073: Enable multiple checkpointers to run simultaneously

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-1984.
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed to branch. Made a note of the TODO for requiring md5 files on image 
transfer.

 HDFS-1073: Enable multiple checkpointers to run simultaneously
 --

 Key: HDFS-1984
 URL: https://issues.apache.org/jira/browse/HDFS-1984
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1984.txt


 One of the motivations of HDFS-1073 is that it decouples the checkpoint 
 process so that multiple checkpoints could be taken at the same time and not 
 interfere with each other.
 Currently on the 1073 branch this doesn't quite work right, since we have 
 some state and validation in FSImage that's tied to a single fsimage_N -- 
 thus if two 2NNs perform a checkpoint at different transaction IDs, only one 
 will succeed.
 As a stress test, we can run two 2NNs each configured with the 
 fs.checkpoint.interval set to 0 which causes them to continuously 
 checkpoint as fast as they can.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-1993) TestCheckpoint needs to clean up between cases

2011-05-24 Thread Todd Lipcon (JIRA)
TestCheckpoint needs to clean up between cases
--

 Key: HDFS-1993
 URL: https://issues.apache.org/jira/browse/HDFS-1993
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1993) TestCheckpoint needs to clean up between cases

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1993:
--

  Component/s: test
   name-node
  Description: TestCheckpoint currently relies on some test ordering in 
order to pass correctly. Instead it should clean itself up in a setUp() method.
Affects Version/s: Edit log branch (HDFS-1073)
Fix Version/s: Edit log branch (HDFS-1073)

 TestCheckpoint needs to clean up between cases
 --

 Key: HDFS-1993
 URL: https://issues.apache.org/jira/browse/HDFS-1993
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node, test
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)


 TestCheckpoint currently relies on some test ordering in order to pass 
 correctly. Instead it should clean itself up in a setUp() method.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1993) TestCheckpoint needs to clean up between cases

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1993:
--

Attachment: hdfs-1993.txt

Pretty trivial patch touching only tests. Will commit this under the CTR 
policy on the branch.

 TestCheckpoint needs to clean up between cases
 --

 Key: HDFS-1993
 URL: https://issues.apache.org/jira/browse/HDFS-1993
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node, test
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1993.txt


 TestCheckpoint currently relies on some test ordering in order to pass 
 correctly. Instead it should clean itself up in a setUp() method.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-1993) TestCheckpoint needs to clean up between cases

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-1993.
---

Resolution: Fixed

 TestCheckpoint needs to clean up between cases
 --

 Key: HDFS-1993
 URL: https://issues.apache.org/jira/browse/HDFS-1993
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node, test
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1993.txt


 TestCheckpoint currently relies on some test ordering in order to pass 
 correctly. Instead it should clean itself up in a setUp() method.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-1992) Remove vestiges of NNStorageListener

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-1992.
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed to branch, thanks Eli.

 Remove vestiges of NNStorageListener
 

 Key: HDFS-1992
 URL: https://issues.apache.org/jira/browse/HDFS-1992
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1992.txt


 In the HDFS-1073 branch, as of HDFS-1926, the edit log's error state no 
 longer needs to be coupled to the FSImage's error state. We left in some 
 vestiges of the NNStorageListener interface to keep that JIRA simpler. This 
 JIRA is to remove the final vestiges since it's no longer necessary and just 
 makes the code harder to follow.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-1994) Fix race conditions when running two rapidly checkpointing 2NNs

2011-05-24 Thread Todd Lipcon (JIRA)
Fix race conditions when running two rapidly checkpointing 2NNs
---

 Key: HDFS-1994
 URL: https://issues.apache.org/jira/browse/HDFS-1994
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Todd Lipcon
Assignee: Todd Lipcon




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1994) Fix race conditions when running two rapidly checkpointing 2NNs

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1994:
--

  Component/s: name-node
  Description: 
HDFS-1984 added the ability to run two secondary namenodes at the same time. 
However, there were two races I found when stress testing this (by running two 
NNs each checkpointing in a tight loop with no sleep):
1) the writing of the seen_txid file was not atomic, so it was at some points 
reading an empty file
2) it was possible for two checkpointers to try to take a checkpoint at the 
same transaction ID, which would cause the two image downloads to collide and 
fail
Affects Version/s: Edit log branch (HDFS-1073)
Fix Version/s: Edit log branch (HDFS-1073)

 Fix race conditions when running two rapidly checkpointing 2NNs
 ---

 Key: HDFS-1994
 URL: https://issues.apache.org/jira/browse/HDFS-1994
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)


 HDFS-1984 added the ability to run two secondary namenodes at the same time. 
 However, there were two races I found when stress testing this (by running 
 two NNs each checkpointing in a tight loop with no sleep):
 1) the writing of the seen_txid file was not atomic, so it was at some points 
 reading an empty file
 2) it was possible for two checkpointers to try to take a checkpoint at the 
 same transaction ID, which would cause the two image downloads to collide and 
 fail

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1993) TestCheckpoint needs to clean up between cases

2011-05-24 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038911#comment-13038911
 ] 

Eli Collins commented on HDFS-1993:
---

+1  lgtm

 TestCheckpoint needs to clean up between cases
 --

 Key: HDFS-1993
 URL: https://issues.apache.org/jira/browse/HDFS-1993
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node, test
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1993.txt


 TestCheckpoint currently relies on some test ordering in order to pass 
 correctly. Instead it should clean itself up in a setUp() method.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1994) Fix race conditions when running two rapidly checkpointing 2NNs

2011-05-24 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-1994:
--

Attachment: hdfs-1994.txt

Here's a preliminary patch. It needs some more unit tests that trigger the 
interleaving described above.

With the stress test described, it now runs much more stably, though there's 
still one unrelated issue in the 2NN (to be addressed later).

 Fix race conditions when running two rapidly checkpointing 2NNs
 ---

 Key: HDFS-1994
 URL: https://issues.apache.org/jira/browse/HDFS-1994
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1994.txt


 HDFS-1984 added the ability to run two secondary namenodes at the same time. 
 However, there were two races I found when stress testing this (by running 
 two NNs each checkpointing in a tight loop with no sleep):
 1) the writing of the seen_txid file was not atomic, so it was at some points 
 reading an empty file
 2) it was possible for two checkpointers to try to take a checkpoint at the 
 same transaction ID, which would cause the two image downloads to collide and 
 fail

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1994) Fix race conditions when running two rapidly checkpointing 2NNs

2011-05-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038909#comment-13038909
 ] 

Todd Lipcon commented on HDFS-1994:
---

Issue #1 is fairly straightforward to address: just write to a side file and 
rename it into place.

Issue #2 is a bit more subtle. The interleaving is the following:

2NN A) calls rollEdits()
2NN B) calls rollEdits()
2NN A) calls getRemoteEditLogManifest()
2NN B) calls getRemoteEditLogManifest()

Both of them then see the same manifest and download all of the available 
edits. Therefore they merge up to the same transaction.

This can be fixed in one of two ways:
a) add a lock to the code that downloads an image, to make sure that only one 
image with a certain txid can be downloaded at a time
b) when the 2NN calls rollEdits, it should only try to checkpoint up to the 
transaction where the roll happened.

IMO we should actually do both: (a) as a sanity check, and (b) to ensure each 
checkpoint is unique, thus leaving us the freedom later to do something like 
include the 2NN's hostname in the image itself.
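The side-file-and-rename approach for issue #1 can be sketched as follows. This is a minimal illustration of the technique, not the actual HDFS-1994 patch; the class name and file layout are made up:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicTxidFile {

    // Write the txid to a temporary side file first, then atomically rename
    // it into place. A concurrent reader sees either the old complete file or
    // the new complete file, never a partially written (empty) one.
    public static void writeTxid(Path seenTxid, long txid) throws IOException {
        Path tmp = seenTxid.resolveSibling(seenTxid.getFileName() + ".tmp");
        Files.write(tmp, Long.toString(txid).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, seenTxid, StandardCopyOption.ATOMIC_MOVE,
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static long readTxid(Path seenTxid) throws IOException {
        String s = new String(Files.readAllBytes(seenTxid),
                StandardCharsets.UTF_8);
        return Long.parseLong(s.trim());
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("txid");
        Path seenTxid = dir.resolve("seen_txid");
        writeTxid(seenTxid, 42L);
        System.out.println(readTxid(seenTxid));
    }
}
```

The rename is what provides atomicity here; writing directly into the final file is exactly what allowed the empty-file read in the race above.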

 Fix race conditions when running two rapidly checkpointing 2NNs
 ---

 Key: HDFS-1994
 URL: https://issues.apache.org/jira/browse/HDFS-1994
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: Edit log branch (HDFS-1073)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: Edit log branch (HDFS-1073)

 Attachments: hdfs-1994.txt


 HDFS-1984 added the ability to run two secondary namenodes at the same time. 
 However, there were two races I found when stress testing this (by running 
 two 2NNs, each checkpointing in a tight loop with no sleep):
 1) the writing of the seen_txid file was not atomic, so a reader would at 
 some points see an empty file
 2) it was possible for two checkpointers to try to take a checkpoint at the 
 same transaction ID, which would cause the two image downloads to collide and 
 fail

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java

2011-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038933#comment-13038933
 ] 

Hadoop QA commented on HDFS-1964:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12480345/hdfs-1964-trunk.1.patch
  against trunk revision 1127317.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 12 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/620//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/620//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/620//console

This message is automatically generated.

 Incorrect HTML unescaping in DatanodeJspHelper.java
 ---

 Key: HDFS-1964
 URL: https://issues.apache.org/jira/browse/HDFS-1964
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0, 0.23.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 0.22.0, 0.23.0

 Attachments: hdfs-1964-trunk.0.patch, hdfs-1964-trunk.1.patch


 HDFS-1575 introduced some HTML unescaping of parameters so that viewing a 
 file would work for paths containing HTML-escaped characters, but in two of 
 the places did the unescaping either too early or too late.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1568) Improve DataXceiver error logging

2011-05-24 Thread Joey Echeverria (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joey Echeverria updated HDFS-1568:
--

Attachment: HDFS-1568-4.patch

I incorporated the most recent feedback and rebased the patch on the latest 
trunk.

 Improve DataXceiver error logging
 -

 Key: HDFS-1568
 URL: https://issues.apache.org/jira/browse/HDFS-1568
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Joey Echeverria
Priority: Minor
  Labels: newbie
 Attachments: HDFS-1568-1.patch, HDFS-1568-3.patch, HDFS-1568-4.patch, 
 HDFS-1568-output-changes.patch


 In supporting customers we often see things like SocketTimeoutExceptions or 
 EOFExceptions coming from DataXceiver, but the logging isn't very good. For 
 example, if we get an IOE while setting up a connection to the downstream 
 mirror in writeBlock, the IP of the downstream mirror isn't logged on the DN 
 side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1568) Improve DataXceiver error logging

2011-05-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038951#comment-13038951
 ] 

Hadoop QA commented on HDFS-1568:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12480359/HDFS-1568-4.patch
  against trunk revision 1127317.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  
org.apache.hadoop.hdfs.server.namenode.TestOverReplicatedBlocks
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/621//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/621//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/621//console

This message is automatically generated.

 Improve DataXceiver error logging
 -

 Key: HDFS-1568
 URL: https://issues.apache.org/jira/browse/HDFS-1568
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Joey Echeverria
Priority: Minor
  Labels: newbie
 Attachments: HDFS-1568-1.patch, HDFS-1568-3.patch, HDFS-1568-4.patch, 
 HDFS-1568-output-changes.patch


 In supporting customers we often see things like SocketTimeoutExceptions or 
 EOFExceptions coming from DataXceiver, but the logging isn't very good. For 
 example, if we get an IOE while setting up a connection to the downstream 
 mirror in writeBlock, the IP of the downstream mirror isn't logged on the DN 
 side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1949) Number format Exception is displayed in Namenode UI when the chunk size field is blank or string value..

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038953#comment-13038953
 ] 

ramkrishna.s.vasudevan commented on HDFS-1949:
--

The failures in the test cases are not due to the submitted patch.



 Number format Exception is displayed in Namenode UI when the chunk size field 
 is blank or string value.. 
 -

 Key: HDFS-1949
 URL: https://issues.apache.org/jira/browse/HDFS-1949
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20-append, 0.21.0, 0.23.0
Reporter: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1949.patch, hdfs-1949-1.patch, hdfs-1949.patch


 In the Namenode UI we have a text box to enter the chunk size.
 The expected value for the chunk size is a valid integer.
 If an invalid value (a string or empty spaces) is provided, it throws a 
 NumberFormatException.
 The expected behaviour is to use the default value if no valid value is 
 specified.
 Solution
 
 We can catch the NumberFormatException and assign the default value when an 
 invalid value is specified.
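The proposed handling can be sketched as follows. The default value and method name are illustrative, not the actual JSP code:

```java
public class ChunkSizeParam {

    // Illustrative default; the real UI would use its configured default.
    static final int DEFAULT_CHUNK_SIZE = 32768;

    // Fall back to the default when the field is blank or not a valid
    // integer, instead of letting NumberFormatException reach the UI.
    public static int parseChunkSize(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_CHUNK_SIZE;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_CHUNK_SIZE;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseChunkSize("1024"));  // valid input
        System.out.println(parseChunkSize("abc"));   // falls back to default
    }
}
```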

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1981) When namenode goes down while checkpointing and if is started again subsequent Checkpointing is always failing

2011-05-24 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038957#comment-13038957
 ] 

ramkrishna.s.vasudevan commented on HDFS-1981:
--

Writing a UT for this may be difficult, as the scenario is hard to reproduce.

The steps that I followed to reproduce this issue are
1. Start namenode and backup namenode
2. Allow checkpointing to happen such that the edits.new file is 
created on the namenode.
3. At this point kill the NN and BNN.
4. Now start the NN and BNN.
5. When checkpointing starts again we will get the above exception.


The exact problem is in the loadFSEdits() API in FSImage.java.

If loadFSEdits() returns 0, then in

if (fsImage.recoverTransitionRead(dataDirs, editsDirs, startOpt)) {
  fsImage.saveNamespace(true);
}

saveNamespace() will not be invoked.

Kindly correct me if you find any problems in this.



 When namenode goes down while checkpointing and if is started again 
 subsequent Checkpointing is always failing
 --

 Key: HDFS-1981
 URL: https://issues.apache.org/jira/browse/HDFS-1981
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0
 Environment: Linux
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.23.0


 This scenario is applicable in NN and BNN case.
 When the namenode goes down after creating edits.new, on subsequent restart 
 the divertFileStreams will not divert to edits.new, as the edits.new file is 
 already present and its size is zero.
 So on trying to saveCheckPoint, an exception occurs:
 2011-05-23 16:38:57,476 WARN org.mortbay.log: /getimage: java.io.IOException: 
 GetImage failed. java.io.IOException: Namenode has an edit log with timestamp 
 of 2011-05-23 16:38:56 but new checkpoint was created using editlog  with 
 timestamp 2011-05-23 16:37:30. Checkpoint Aborted.
 Is this a bug, or is it the expected behaviour?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1727) fsck command can display command usage if user passes any illegal argument

2011-05-24 Thread sravankorumilli (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038960#comment-13038960
 ] 

sravankorumilli commented on HDFS-1727:
---

I have verified that these test failures are not related to this patch.

 fsck command can display command usage if user passes any illegal argument
 --

 Key: HDFS-1727
 URL: https://issues.apache.org/jira/browse/HDFS-1727
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.20.1, 0.23.0
Reporter: Uma Maheswara Rao G
Priority: Minor
 Attachments: HDFS-1727.patch


 In the fsck command, if the user passes arguments like
 ./hadoop fsck -test -files -blocks -racks
 it will take / and display information about the whole DFS with regard to 
 files, blocks, and racks.
 But here we are hiding the user's mistake. Instead, we could display the 
 command usage if the user passes any invalid argument like the above.
 Similarly, if the user passes an illegal optional argument like
 ./hadoop fsck /test -listcorruptfileblocks instead of
 ./hadoop fsck /test -list-corruptfileblocks, we can also display the proper 
 command usage.
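A minimal sketch of the proposed validation. The option set and class name here are illustrative, not the actual DFSck code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class FsckArgsSketch {

    // Illustrative subset of fsck options; the real tool defines its own set.
    static final Set<String> VALID_OPTS = new HashSet<>(Arrays.asList(
            "-move", "-delete", "-files", "-blocks", "-locations",
            "-racks", "-openforwrite", "-list-corruptfileblocks"));

    // Returns true only if every dash-prefixed argument is a recognized
    // option; non-dash arguments are treated as paths. On false, the caller
    // should print the usage message instead of silently defaulting to /.
    public static boolean validArgs(String[] args) {
        for (String arg : args) {
            if (arg.startsWith("-") && !VALID_OPTS.contains(arg)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] typo = {"/test", "-listcorruptfileblocks"};
        if (!validArgs(typo)) {
            System.err.println(
                "Usage: fsck <path> [-move | -delete | -files | -blocks | ...]");
        }
    }
}
```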

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI

2011-05-24 Thread Tanping Wang (JIRA)
Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
--

 Key: HDFS-1995
 URL: https://issues.apache.org/jira/browse/HDFS-1995
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Tanping Wang
Assignee: Tanping Wang
Priority: Minor
 Fix For: 0.23.0


Four small modifications/fixes:
on the dfshealth page:
1) fix remaining% to be remaining / total (it was mistakenly shown as used / 
total)
on the dfsclusterhealth page:
1) make the table header 8em wide
2) fix the typo (inconsistency): change Total Files and Blocks to Total Files 
and Directories
3) make DFS Used the sum of the block pool used space of every namespace, and 
change the label names accordingly.
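The remaining% fix amounts to computing the ratio from the right numerator, sketched here with illustrative names:

```java
public class DfsHealthPercent {

    // remaining% should be remaining / total, not used / total.
    public static double remainingPercent(long remainingBytes, long totalBytes) {
        if (totalBytes <= 0) {
            return 0.0; // avoid division by zero on an empty cluster
        }
        return 100.0 * remainingBytes / totalBytes;
    }

    public static void main(String[] args) {
        // 75 of 100 bytes used leaves 25 remaining, i.e. 25% remaining.
        System.out.println(remainingPercent(25L, 100L));
    }
}
```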

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-1995) Minor modification to both dfsclusterhealth and dfshealth pages for Web UI

2011-05-24 Thread Tanping Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanping Wang updated HDFS-1995:
---

Attachment: HDFS-1995.patch

 Minor modification to both dfsclusterhealth and dfshealth pages for Web UI
 --

 Key: HDFS-1995
 URL: https://issues.apache.org/jira/browse/HDFS-1995
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 0.23.0
Reporter: Tanping Wang
Assignee: Tanping Wang
Priority: Minor
 Fix For: 0.23.0

 Attachments: HDFS-1995.patch


 Four small modifications/fixes:
 on the dfshealth page:
 1) fix remaining% to be remaining / total (it was mistakenly shown as used / 
 total)
 on the dfsclusterhealth page:
 1) make the table header 8em wide
 2) fix the typo (inconsistency): change Total Files and Blocks to Total Files 
 and Directories
 3) make DFS Used the sum of the block pool used space of every namespace, and 
 change the label names accordingly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1964) Incorrect HTML unescaping in DatanodeJspHelper.java

2011-05-24 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13038970#comment-13038970
 ] 

Todd Lipcon commented on HDFS-1964:
---

Hi Aaron. I committed the patch to trunk, but looks like the patch doesn't 
quite apply to 0.22. Mind uploading a patch for that branch? Thanks!

 Incorrect HTML unescaping in DatanodeJspHelper.java
 ---

 Key: HDFS-1964
 URL: https://issues.apache.org/jira/browse/HDFS-1964
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0, 0.23.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers
 Fix For: 0.22.0, 0.23.0

 Attachments: hdfs-1964-trunk.0.patch, hdfs-1964-trunk.1.patch


 HDFS-1575 introduced some HTML unescaping of parameters so that viewing a 
 file would work for paths containing HTML-escaped characters, but in two of 
 the places did the unescaping either too early or too late.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira