[jira] [Updated] (HDFS-1845) symlink comes up as directory after namenode restart
[ https://issues.apache.org/jira/browse/HDFS-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John George updated HDFS-1845: -- Attachment: HDFS-1845-2.patch Attaching Yahoo!-specific patch for the bug. $ ant test-core -Dtestcase=TestCheckpoint .. .. checkfailure: BUILD SUCCESSFUL Total time: 37 seconds symlink comes up as directory after namenode restart Key: HDFS-1845 URL: https://issues.apache.org/jira/browse/HDFS-1845 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1845-2.patch, HDFS-1845-apache-2.patch, HDFS-1845-apache-3.patch, HDFS-1845-apache.patch, hdfs-1845-branch22-1.patch When a symlink is first created, it gets added to EditLogs. When the namenode is restarted, it reads from this editlog, represents the symlink correctly, and saves this information to its image. If the namenode is restarted again, it reads from this FSImage, but thinks that a symlink is a directory. This is because it uses Block[] blocks to determine whether an INode is a directory, a file, or a symlink. Since both a directory and a symlink have blocks as null, it thinks that a symlink is a directory. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
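The misclassification described in HDFS-1845 can be illustrated with a minimal, hypothetical sketch (the class and method names below are invented for illustration and are not the actual FSImage code): a discriminator based only on whether `blocks == null` cannot tell a directory from a symlink, since neither stores blocks, so a distinct marker such as a link target must be consulted first.

```java
// Hypothetical sketch of the bug: both a directory and a symlink carry no
// blocks, so "blocks == null" alone misreads a symlink as a directory.
public class InodeKindSketch {
    // Buggy discriminator: the failure mode reported in this issue.
    static String buggyKind(long[] blocks) {
        return (blocks == null) ? "directory" : "file"; // symlinks fall in here too
    }

    // Fixed discriminator: consult an explicit symlink marker first.
    static String fixedKind(long[] blocks, String symlinkTarget) {
        if (symlinkTarget != null) return "symlink";
        return (blocks == null) ? "directory" : "file";
    }

    public static void main(String[] args) {
        System.out.println(buggyKind(null));            // "directory" (wrong for a symlink)
        System.out.println(fixedKind(null, "/target")); // "symlink"
    }
}
```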
[jira] [Assigned] (HDFS-1475) Want a -d flag in hadoop dfs -ls : Do not expand directories
[ https://issues.apache.org/jira/browse/HDFS-1475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp reassigned HDFS-1475: - Assignee: Daryn Sharp Want a -d flag in hadoop dfs -ls : Do not expand directories Key: HDFS-1475 URL: https://issues.apache.org/jira/browse/HDFS-1475 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.20.1 Environment: any Reporter: Greg Connor Assignee: Daryn Sharp Priority: Minor I would really love it if dfs -ls had a -d flag, like unix ls -d, which would list the directories matching the name or pattern but *not* their contents. Current behavior is to expand every matching dir and list its contents, which is awkward if I just want to see the matching dirs themselves (and their permissions). Worse, if a directory exists but is empty, -ls simply returns no output at all, which is unhelpful. So far we have used some ugly workarounds to this in various scripts, such as:
-ls /path/to | grep dir    # wasteful, and problematic if dir is a substring of the path
-stat /path/to/dir Exists  # stat has no way to get back the full path, sadly
-count /path/to/dir        # works but is probably overkill
Really there is no reliable replacement for ls -d -- the above hacks will work but only for certain isolated contexts. (I'm not a java programmer, or else I would probably submit a patch for this, or make my own jar file to do this since I need it a lot.) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1822) Editlog opcodes overlap between 20 security and later releases
[ https://issues.apache.org/jira/browse/HDFS-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022177#comment-13022177 ] Suresh Srinivas commented on HDFS-1822: --- bq. Doesn't seem like this keeps branch-specific hackery confined to the branch. It does. We no longer need code for conflicting opcodes in later releases. The new check that is being added is for version compatibility. Editlog opcodes overlap between 20 security and later releases -- Key: HDFS-1822 URL: https://issues.apache.org/jira/browse/HDFS-1822 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0, 0.22.0, 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1822.patch The same opcodes are used for different operations between 0.20.security, 0.22 and 0.23. This results in failure to load editlogs on later releases, especially during upgrades. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1850) DN should transmit absolute failed volume count rather than increments to the NN
[ https://issues.apache.org/jira/browse/HDFS-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-1850: -- Hadoop Flags: (was: [Incompatible change]) Good point. Flag removed. DN should transmit absolute failed volume count rather than increments to the NN Key: HDFS-1850 URL: https://issues.apache.org/jira/browse/HDFS-1850 Project: Hadoop HDFS Issue Type: Bug Components: data-node, name-node Reporter: Eli Collins Assignee: Eli Collins Fix For: 0.23.0 The API added in HDFS-811 for the DN to report volume failures to the NN is inc(DN). However the given sequence of events will result in the NN forgetting about reported failed volumes: # DN loses a volume and reports it # NN restarts # DN re-registers to the new NN A more robust interface would be to have the DN report the total number of volume failures to the NN each heart beat (the same way other volume state is transmitted). This will likely be an incompatible change since it requires changing the Datanode protocol. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
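The sequence of events in HDFS-1850 can be sketched as follows. This is a hypothetical illustration, not the Datanode protocol code: with incremental reports the NN accumulates state that a restart wipes out, while an absolute count carried on every heartbeat is idempotent, so the next heartbeat restores it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch contrasting the two reporting styles discussed above.
public class FailedVolumeReporting {
    // NN-side view of failed volume counts, keyed by datanode id.
    final Map<String, Integer> failedVolumesPerDn = new HashMap<>();

    // HDFS-811 style: the DN reports "one more volume failed".
    void reportIncrement(String dnId) {
        failedVolumesPerDn.merge(dnId, 1, Integer::sum);
    }

    // Proposed style: the DN reports its total failed-volume count each heartbeat.
    void heartbeat(String dnId, int totalFailedVolumes) {
        failedVolumesPerDn.put(dnId, totalFailedVolumes);
    }

    int failedVolumes(String dnId) {
        return failedVolumesPerDn.getOrDefault(dnId, 0);
    }

    public static void main(String[] args) {
        FailedVolumeReporting nn = new FailedVolumeReporting();
        nn.reportIncrement("dn1");        // DN loses a volume and reports it
        nn = new FailedVolumeReporting(); // NN restarts; DN re-registers
        System.out.println(nn.failedVolumes("dn1")); // 0 -- the failure is forgotten
        nn.heartbeat("dn1", 1);           // absolute count heals on the next heartbeat
        System.out.println(nn.failedVolumes("dn1")); // 1
    }
}
```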
[jira] [Commented] (HDFS-1842) Cannot upgrade 0.20.203 to 0.21 with an editslog present
[ https://issues.apache.org/jira/browse/HDFS-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022193#comment-13022193 ] Suresh Srinivas commented on HDFS-1842: --- bq. You just need to demand that the edits is empty. This was my first choice as well. My concern was that it might throw this error after partially upgrading to 204, and might require a rollback to go back to the previous release to save the namespace. I looked at the code more closely and rollback is not required. There are a couple of choices on how this can be done:
# If the editlog file size == 0, treat it as having no edits. I am reluctant to go this route. With the new editlog changes, could the editlog size be != 0 while still containing no file system operation entries?
# While loading editlogs, wait for numEdits to go to 1 and then throw an error. This means the entire fsimage is loaded before the error is thrown. If we are doing this we might as well go with the current patch. The opcode conversion code then remains only in the 2xx release.
Cannot upgrade 0.20.203 to 0.21 with an editslog present Key: HDFS-1842 URL: https://issues.apache.org/jira/browse/HDFS-1842 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.20.203.0 Reporter: Allen Wittenauer Priority: Blocker Attachments: HDFS-1842.rel203.patch, HDFS-1842.rel204.patch If a user installs 0.20.203 and then upgrades to 0.21 with an editslog present, 0.21 will corrupt the file system due to opcode re-usage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1840) Terminate LeaseChecker when all writing files are closed.
[ https://issues.apache.org/jira/browse/HDFS-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1309#comment-1309 ] Suresh Srinivas commented on HDFS-1840: --- +1 for the patch. Terminate LeaseChecker when all writing files are closed. - Key: HDFS-1840 URL: https://issues.apache.org/jira/browse/HDFS-1840 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Attachments: h1840_20110418.patch, h1840_20110419.patch, h1840_20110419b.patch In {{DFSClient}}, when there are files opened for write, a {{LeaseChecker}} thread is started for updating the leases periodically. However, it never terminates when all writing files are closed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1840) Terminate LeaseChecker when all writing files are closed.
[ https://issues.apache.org/jira/browse/HDFS-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-1840: - Resolution: Fixed Fix Version/s: 0.23.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) I have committed this. Terminate LeaseChecker when all writing files are closed. - Key: HDFS-1840 URL: https://issues.apache.org/jira/browse/HDFS-1840 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1840_20110418.patch, h1840_20110419.patch, h1840_20110419b.patch In {{DFSClient}}, when there are files opened for write, a {{LeaseChecker}} thread is started for updating the leases periodically. However, it never terminates when all writing files are closed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1562) Add rack policy tests
[ https://issues.apache.org/jira/browse/HDFS-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022261#comment-13022261 ] Matt Foley commented on HDFS-1562: -- Hi Eli, didn't realize you were going to look at TestDatanodeBlockScanner too. I did extensive mods to it as part of HDFS-1295. Let's take a look and compare. Add rack policy tests - Key: HDFS-1562 URL: https://issues.apache.org/jira/browse/HDFS-1562 Project: Hadoop HDFS Issue Type: Test Components: name-node, test Affects Versions: 0.23.0 Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-1562-1.patch, hdfs-1562-2.patch, hdfs-1562-3.patch The existing replication tests (TestBlocksWithNotEnoughRacks, TestPendingReplication, TestOverReplicatedBlocks, TestReplicationPolicy, TestUnderReplicatedBlocks, and TestReplication) are missing tests for rack policy violations. This jira adds the following tests which I created when generating a new patch for HDFS-15.
* Test that blocks that have a sufficient number of total replicas, but are not replicated cross rack, get replicated cross rack when a rack becomes available.
* Test that new blocks for an underreplicated file will get replicated cross rack.
* Mark a block as corrupt, test that when it is re-replicated it is still replicated across racks.
* Reduce the replication factor of a file, making sure that the only block that is across racks is not removed when deleting replicas.
* Test that when a block is replicated because a replica is lost due to host failure, the rack policy is preserved.
* Test that when the excess replicas of a block are reduced due to a node re-joining the cluster, the rack policy is not violated.
* Test that rack policy is still respected when blocks are replicated due to node decommissioning.
* Test that rack policy is still respected when blocks are replicated due to node decommissioning, even when the blocks are over-replicated.
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-148) timeout when writing dfs file causes infinite loop when closing the file
[ https://issues.apache.org/jira/browse/HDFS-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022266#comment-13022266 ] John Meagher commented on HDFS-148: --- It looks like this was fixed by HDFS-278. timeout when writing dfs file causes infinite loop when closing the file Key: HDFS-148 URL: https://issues.apache.org/jira/browse/HDFS-148 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.2 Reporter: Nigel Daley Assignee: Sameer Paranjpye Priority: Critical If, when writing to a dfs file, I get a timeout exception: 06/11/29 11:16:05 WARN fs.DFSClient: Error while writing. java.net.SocketTimeoutException: timed out waiting for rpc response at org.apache.hadoop.ipc.Client.call(Client.java:469) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:164) at org.apache.hadoop.dfs.$Proxy0.reportWrittenBlock(Unknown Source) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.internalClose(DFSClient.java:1220) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1175) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.flush(DFSClient.java:1121) at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.write(DFSClient.java:1103) at org.apache.hadoop.examples.NNBench2.createWrite(NNBench2.java:107) at org.apache.hadoop.examples.NNBench2.main(NNBench2.java:247) then the close() operation on the file appears to go into an infinite loop of retrying: 06/11/29 13:11:19 INFO fs.DFSClient: Could not complete file, retrying... 06/11/29 13:11:20 INFO fs.DFSClient: Could not complete file, retrying... 06/11/29 13:11:21 INFO fs.DFSClient: Could not complete file, retrying... 06/11/29 13:11:23 INFO fs.DFSClient: Could not complete file, retrying... 06/11/29 13:11:24 INFO fs.DFSClient: Could not complete file, retrying... ... -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
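The failure mode in HDFS-148 is a retry loop with no exit condition. The sketch below is hypothetical and not the actual DFSClient code: it shows how bounding the retries around a "complete file" call turns an indefinite hang into a reportable error (the interface and method names are invented stand-ins for the NN RPC).

```java
import java.io.IOException;

// Hedged sketch: close() retries "complete file" a bounded number of times
// instead of looping forever when the namenode never acknowledges the file.
public class BoundedCompleteRetry {
    interface CompleteCall { boolean complete(); } // stand-in for the NN RPC

    static void closeWithRetries(CompleteCall nn, int maxRetries) throws IOException {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (nn.complete()) return; // file completed successfully
            System.out.println("Could not complete file, retrying...");
        }
        // With no bound, the loop above is the infinite retry seen in the logs.
        throw new IOException("could not complete file after " + maxRetries + " retries");
    }

    public static void main(String[] args) {
        try {
            closeWithRetries(() -> false, 3); // NN never completes the file
        } catch (IOException e) {
            System.out.println("close failed: " + e.getMessage());
        }
    }
}
```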
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022362#comment-13022362 ] John George commented on HDFS-1788: --- To me it looks like this bug has two parts:
1. The ability for FsShell to show a symlink as 'l' with the link target ('link -> target'), just like in Linux. This seems to be a straightforward change in FsShell.
2. In order for FsShell to be able to do this, it needs to know that it is dealing with a symlink. As of now, it looks like FsShell uses FileSystem to check if a given path is a symlink or not. The FileSystem class does not entirely support symlinks. So, in order to fix this bug, ls (FsShell) should either a) start using FileContext (HADOOP-6424) or b) FileSystem should be fixed to be able to deal with symlinks. In order for FileSystem to support symlinks, it should either implement getFileLinkStatus(), or getFileStatus() should itself be able to handle symlinks. The fastest/easiest way seems to be getting getFileStatus() to also return the FileStatus of links. The best solution (though not the fastest) seems to be to let FsShell use FileContext. Would it even make sense to let getFileStatus() return the status of symlinks as well (in cases where the underlying filesystem supports symlinks) so that ls or any other command that uses FileSystem (as of today) can also deal with symlinks? Comments and suggestions welcome. FsShell ls: Show symlinks properties Key: HDFS-1788 URL: https://issues.apache.org/jira/browse/HDFS-1788 Project: Hadoop HDFS Issue Type: Improvement Components: tools Reporter: Jonathan Eagles Assignee: John George Priority: Minor ls FsShell command implementation has been consistent with the linux implementations of ls \-l. With the addition of symlinks, I would expect the ability to show file type 'd' for directory, '\-' for file, and 'l' for symlink. 
In addition, following the linkname entry for symlinks, I would expect the ability to show '-> link target'. In Linux, the default is to show the properties of the link and not of the link target. In Linux, the '-L' option allows for the dereferencing of symlinks to show link target properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
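Part (1) of the discussion above, rendering the 'l' type character and the "link -> target" suffix, can be sketched as below. This is a hypothetical illustration with invented names, not the actual FsShell ls implementation; the real change would draw the two booleans from a FileStatus obtained for the link itself.

```java
// Hypothetical sketch of the ls -l style rendering discussed above.
public class LsSymlinkFormat {
    // Leading type character, as in Linux ls -l: 'd', '-', or 'l'.
    static char typeChar(boolean isDir, boolean isSymlink) {
        if (isSymlink) return 'l'; // report the link itself, not its target
        return isDir ? 'd' : '-';
    }

    // Append " -> target" after the linkname entry, symlinks only.
    static String formatName(String path, boolean isSymlink, String target) {
        return isSymlink ? path + " -> " + target : path;
    }

    public static void main(String[] args) {
        System.out.println(typeChar(false, true) + " "
            + formatName("/user/foo/link", true, "/user/foo/real"));
        // l /user/foo/link -> /user/foo/real
    }
}
```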
[jira] [Commented] (HDFS-1842) Cannot upgrade 0.20.203 to 0.21 with an editslog present
[ https://issues.apache.org/jira/browse/HDFS-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022365#comment-13022365 ] Konstantin Shvachko commented on HDFS-1842: --- Yes, rollback is not needed, as image and edits loading does not change any files or directories. Both choices work for me. (1) provides faster failure. The size of the empty edits is only sizeof(long), which is the layoutVersion size. (2) is also fine and would be my preference. Informed admins will do saveNamespace() before upgrading, so the edits will be empty. But if they forget, the upgrade will fail after 10 minutes, which is on the order of the time to restart the name-node with 203 and then upgrade again. Cannot upgrade 0.20.203 to 0.21 with an editslog present Key: HDFS-1842 URL: https://issues.apache.org/jira/browse/HDFS-1842 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.20.203.0 Reporter: Allen Wittenauer Priority: Blocker Attachments: HDFS-1842.rel203.patch, HDFS-1842.rel204.patch If a user installs 0.20.203 and then upgrades to 0.21 with an editslog present, 0.21 will corrupt the file system due to opcode re-usage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
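Choice (1) from the discussion above can be sketched as a size check. This is a hedged illustration, not the namenode code: per the comment, an "empty" edits file still holds its stored layout version (sizeof(long)), so the check is whether anything follows that header. The class name, constant name, and header size here are assumptions for illustration.

```java
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch: treat the edits file as empty when nothing follows
// the layout-version header (header size taken from the comment above).
public class EmptyEditsCheck {
    static final long HEADER_BYTES = Long.BYTES; // layoutVersion, per the comment

    static boolean hasEdits(File edits) {
        return edits.length() > HEADER_BYTES;
    }

    public static void main(String[] args) throws IOException {
        File edits = File.createTempFile("edits", null);
        edits.deleteOnExit();
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(edits))) {
            out.writeLong(-38L); // header only: a stand-in layout version value
        }
        System.out.println(hasEdits(edits)); // false -- no operations, safe to upgrade
    }
}
```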
[jira] [Commented] (HDFS-1842) Cannot upgrade 0.20.203 to 0.21 with an editslog present
[ https://issues.apache.org/jira/browse/HDFS-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022370#comment-13022370 ] Suresh Srinivas commented on HDFS-1842: --- OK I will go with (2) then. Cannot upgrade 0.20.203 to 0.21 with an editslog present Key: HDFS-1842 URL: https://issues.apache.org/jira/browse/HDFS-1842 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.20.203.0 Reporter: Allen Wittenauer Priority: Blocker Attachments: HDFS-1842.rel203.patch, HDFS-1842.rel204.patch If a user installs 0.20.203 and then upgrades to 0.21 with an editslog present, 0.21 will corrupt the file system due to opcode re-usage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1052) HDFS scalability with multiple namenodes
[ https://issues.apache.org/jira/browse/HDFS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1052: -- Attachment: (was: HDFS-1052.patch) HDFS scalability with multiple namenodes Key: HDFS-1052 URL: https://issues.apache.org/jira/browse/HDFS-1052 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.22.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: Block pool proposal.pdf, HDFS-1052.patch, Mulitple Namespaces5.pdf, high-level-design.pdf HDFS currently uses a single namenode that limits scalability of the cluster. This jira proposes an architecture to scale the nameservice horizontally using multiple namenodes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1052) HDFS scalability with multiple namenodes
[ https://issues.apache.org/jira/browse/HDFS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-1052: -- Attachment: HDFS-1052.patch Latest patch. HDFS scalability with multiple namenodes Key: HDFS-1052 URL: https://issues.apache.org/jira/browse/HDFS-1052 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.22.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: Block pool proposal.pdf, HDFS-1052.patch, Mulitple Namespaces5.pdf, high-level-design.pdf HDFS currently uses a single namenode that limits scalability of the cluster. This jira proposes an architecture to scale the nameservice horizontally using multiple namenodes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-1843) Discover file not found early for file append
[ https://issues.apache.org/jira/browse/HDFS-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Mundlapudi updated HDFS-1843: - Attachment: HDFS-1843-2.patch Thanks for code review, Jitendra. I have incorporated the changes. Discover file not found early for file append -- Key: HDFS-1843 URL: https://issues.apache.org/jira/browse/HDFS-1843 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Bharath Mundlapudi Assignee: Bharath Mundlapudi Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1843-1.patch, HDFS-1843-2.patch For the append call, discover file not found exception early and avoid extra server call. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1788) FsShell ls: Show symlinks properties
[ https://issues.apache.org/jira/browse/HDFS-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022404#comment-13022404 ] Eli Collins commented on HDFS-1788: --- I think it makes sense to move FsShell over to FileContext (HADOOP-6424). That's substantially less work than supporting symlinks in FileSystem and work we need to do anyway. FsShell ls: Show symlinks properties Key: HDFS-1788 URL: https://issues.apache.org/jira/browse/HDFS-1788 Project: Hadoop HDFS Issue Type: Improvement Components: tools Reporter: Jonathan Eagles Assignee: John George Priority: Minor ls FsShell command implementation has been consistent with the linux implementations of ls \-l. With the addition of symlinks, I would expect the ability to show file type 'd' for directory, '\-' for file, and 'l' for symlink. In addition, following the linkname entry for symlinks, I would expect the ability to show '-> link target'. In Linux, the default is to show the properties of the link and not of the link target. In Linux, the '-L' option allows for the dereferencing of symlinks to show link target properties, but it is not the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1845) symlink comes up as directory after namenode restart
[ https://issues.apache.org/jira/browse/HDFS-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022406#comment-13022406 ] Eli Collins commented on HDFS-1845: --- bq. Attaching Yahoo! specific patch for the bug. Do you mean for branch-0.20-security? I think HDFS-1845-2.patch is equivalent to hdfs-1845-branch22-1.patch. symlink comes up as directory after namenode restart Key: HDFS-1845 URL: https://issues.apache.org/jira/browse/HDFS-1845 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1845-2.patch, HDFS-1845-apache-2.patch, HDFS-1845-apache-3.patch, HDFS-1845-apache.patch, hdfs-1845-branch22-1.patch When a symlink is first created, it gets added to EditLogs. When the namenode is restarted, it reads from this editlog, represents the symlink correctly, and saves this information to its image. If the namenode is restarted again, it reads from this FSImage, but thinks that a symlink is a directory. This is because it uses Block[] blocks to determine whether an INode is a directory, a file, or a symlink. Since both a directory and a symlink have blocks as null, it thinks that a symlink is a directory. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1052) HDFS scalability with multiple namenodes
[ https://issues.apache.org/jira/browse/HDFS-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022409#comment-13022409 ] Hadoop QA commented on HDFS-1052: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12476941/HDFS-1052.patch against trunk revision 1095461. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 322 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause tar ant target to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these core unit tests: -1 contrib tests. The patch failed contrib unit tests. -1 system test framework. The patch failed system test framework compile. Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/393//testReport/ Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/393//console This message is automatically generated. HDFS scalability with multiple namenodes Key: HDFS-1052 URL: https://issues.apache.org/jira/browse/HDFS-1052 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Affects Versions: 0.22.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: Block pool proposal.pdf, HDFS-1052.patch, Mulitple Namespaces5.pdf, high-level-design.pdf HDFS currently uses a single namenode that limits scalability of the cluster. This jira proposes an architecture to scale the nameservice horizontally using multiple namenodes. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1295) Improve namenode restart times by short-circuiting the first block reports from datanodes
[ https://issues.apache.org/jira/browse/HDFS-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022414#comment-13022414 ] Suresh Srinivas commented on HDFS-1295: --- Comments:
# Minor: TestDatanodeBlockScanner - could you LOG.info or LOG.debug instead of System.out?
# Is it worth retaining printDatanodeAssignments() and printDatanodeBlockReports(), which were probably added as debug code?
# In the test we have TIMEOUT set to 20s. Is it reasonably long enough so that tests do not fail?
# In block report time, why is the report creation time not included in metrics?
Improve namenode restart times by short-circuiting the first block reports from datanodes - Key: HDFS-1295 URL: https://issues.apache.org/jira/browse/HDFS-1295 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: dhruba borthakur Assignee: Matt Foley Fix For: 0.23.0 Attachments: IBR_shortcut_v2a.patch, IBR_shortcut_v3atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v6atrunk.patch, shortCircuitBlockReport_1.txt The namenode restart is dominated by the performance of processing block reports. On a 2000 node cluster with 90 million blocks, block report processing takes 30 to 40 minutes. The namenode diffs the contents of the incoming block report with the contents of the blocks map, and then applies these diffs to the blocksMap, but in reality there is no need to compute the diff because this is the first block report from the datanode. This code change improves block report processing time by 300%. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1463) accessTime updates should not occur in safeMode
[ https://issues.apache.org/jira/browse/HDFS-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022424#comment-13022424 ] Hudson commented on HDFS-1463: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) accessTime updates should not occur in safeMode --- Key: HDFS-1463 URL: https://issues.apache.org/jira/browse/HDFS-1463 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.23.0 Attachments: accessTimeSafeMode.txt, accessTimeSafeMode.txt FSNamesystem.getBlockLocations sometimes need to update the accessTime of files. If the namenode is in safemode, this call should fail. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1360) TestBlockRecovery should bind ephemeral ports
[ https://issues.apache.org/jira/browse/HDFS-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022425#comment-13022425 ] Hudson commented on HDFS-1360: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestBlockRecovery should bind ephemeral ports - Key: HDFS-1360 URL: https://issues.apache.org/jira/browse/HDFS-1360 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.23.0 Attachments: hdfs-1360.txt TestBlockRecovery starts up a DN, but doesn't configure the various ports to be ephemeral, so the test fails if run on a machine where another DN is already running. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1561) BackupNode listens on default host
[ https://issues.apache.org/jira/browse/HDFS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022426#comment-13022426 ] Hudson commented on HDFS-1561: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) BackupNode listens on default host -- Key: HDFS-1561 URL: https://issues.apache.org/jira/browse/HDFS-1561 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Priority: Blocker Fix For: 0.22.0 Attachments: BNAddress.patch, BNAddress.patch Currently BackupNode uses DNS to find its default host name, and then starts RPC server listening on that address ignoring the address specified in the configuration. Therefore, there is no way to start BackupNode on a particular ip or host address. BackupNode should use the address specified in the configuration instead. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1666) TestAuthorizationFilter is failing
[ https://issues.apache.org/jira/browse/HDFS-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022428#comment-13022428 ] Hudson commented on HDFS-1666: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1666. Disable failing hdfsproxy test TestAuthorizationFilter. Contributed by Todd Lipcon TestAuthorizationFilter is failing -- Key: HDFS-1666 URL: https://issues.apache.org/jira/browse/HDFS-1666 Project: Hadoop HDFS Issue Type: Bug Components: contrib/hdfsproxy Affects Versions: 0.22.0, 0.23.0 Reporter: Konstantin Boudnik Priority: Blocker Attachments: hdfs-1666-disable-tests.txt two test cases were failing for a number of builds (see attached logs) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1560) dfs.data.dir permissions should default to 700
[ https://issues.apache.org/jira/browse/HDFS-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022427#comment-13022427 ] Hudson commented on HDFS-1560: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) dfs.data.dir permissions should default to 700 -- Key: HDFS-1560 URL: https://issues.apache.org/jira/browse/HDFS-1560 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.22.0, 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1560.txt Currently, dfs.data.dir defaults to 755 permissions, which isn't necessary for any reason, and is a security issue if not changed on a secured cluster. We should default to 700 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1602) Fix HADOOP-4885 for it is doesn't work as expected.
[ https://issues.apache.org/jira/browse/HDFS-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022433#comment-13022433 ] Hudson commented on HDFS-1602: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Fix HADOOP-4885 for it is doesn't work as expected. --- Key: HDFS-1602 URL: https://issues.apache.org/jira/browse/HDFS-1602 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0, 0.23.0 Reporter: Konstantin Boudnik Assignee: Boris Shkolnik Fix For: 0.22.0 Attachments: HDFS-1602-1.patch, HDFS-1602.patch, HDFS-1602v22.patch NameNode storage restore functionality doesn't work (as HDFS-903 demonstrated). This needs to be either disabled, or removed, or fixed. This feature also fails HDFS-1496 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1406) TestCLI fails on Ubuntu with default /etc/hosts
[ https://issues.apache.org/jira/browse/HDFS-1406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022438#comment-13022438 ] Hudson commented on HDFS-1406: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestCLI fails on Ubuntu with default /etc/hosts --- Key: HDFS-1406 URL: https://issues.apache.org/jira/browse/HDFS-1406 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.2, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Konstantin Boudnik Priority: Minor Fix For: 0.20.3 Attachments: HDFS-1406.0.20.patch, HDFS-1406.patch, Test.java Depending on the order of entries in /etc/hosts, TestCLI can fail. This is because it sets fs.default.name to localhost, and then the bound IPC socket on the NN side reports its hostname as foobar-host if the entry for 127.0.0.1 lists foobar-host before localhost. This seems to be the default in some versions of Ubuntu. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1665) Balancer sleeps inadequately
[ https://issues.apache.org/jira/browse/HDFS-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022429#comment-13022429 ] Hudson commented on HDFS-1665: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Balancer sleeps inadequately Key: HDFS-1665 URL: https://issues.apache.org/jira/browse/HDFS-1665 Project: Hadoop HDFS Issue Type: Bug Components: balancer Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.21.1, Federation Branch, 0.22.0, 0.23.0 Attachments: h1665_20110225.patch, h1665_20110225b.patch, h1665_20110225b_fed.patch The value of {{dfs.heartbeat.interval}} is in seconds. The Balancer seems to have misused it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
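The bug class behind HDFS-1665 is easy to reproduce in a sketch: a config value measured in seconds gets handed to an API that expects milliseconds. `heartbeatIntervalMillis` below is a hypothetical helper, not the actual Balancer code, which reads the value via `Configuration`:

```java
// Sketch, not the actual Balancer fix: dfs.heartbeat.interval is measured
// in seconds, so it must be scaled to milliseconds before being passed to
// Thread.sleep(). Sleeping for N milliseconds when N seconds were intended
// makes the Balancer sleep 1000x too briefly.
public class HeartbeatSleep {
    // Hypothetical helper; the real code obtains the interval from the
    // Hadoop Configuration object.
    static long heartbeatIntervalMillis(long heartbeatIntervalSeconds) {
        return heartbeatIntervalSeconds * 1000L;
    }

    public static void main(String[] args) {
        long intervalSec = 3; // the default dfs.heartbeat.interval is 3 seconds
        long sleepMs = heartbeatIntervalMillis(intervalSec);
        System.out.println(sleepMs); // 3000
        // A correct caller would do: Thread.sleep(sleepMs);
    }
}
```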
[jira] [Commented] (HDFS-1606) Provide a stronger data guarantee in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022432#comment-13022432 ] Hudson commented on HDFS-1606: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Provide a stronger data guarantee in the write pipeline --- Key: HDFS-1606 URL: https://issues.apache.org/jira/browse/HDFS-1606 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, hdfs client, name-node Affects Versions: 0.23.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1606_20110210.patch, h1606_20110211.patch, h1606_20110217.patch, h1606_20110228.patch, h1606_20110404.patch, h1606_20110405.patch, h1606_20110405b.patch, h1606_20110406.patch, h1606_20110406b.patch, h1606_20110407.patch, h1606_20110407b.patch, h1606_20110407c.patch, h1606_20110408.patch, h1606_20110408b.patch In the current design, if there is a datanode/network failure in the write pipeline, DFSClient will try to remove the failed datanode from the pipeline and then continue writing with the remaining datanodes. As a result, the number of datanodes in the pipeline is decreased. Unfortunately, it is possible that DFSClient may incorrectly remove a healthy datanode but leave the failed datanode in the pipeline because failure detection may be inaccurate under erroneous conditions. We propose to have a new mechanism for adding new datanodes to the pipeline in order to provide a stronger data guarantee. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1601) Pipeline ACKs are sent as lots of tiny TCP packets
[ https://issues.apache.org/jira/browse/HDFS-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022435#comment-13022435 ] Hudson commented on HDFS-1601: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Pipeline ACKs are sent as lots of tiny TCP packets -- Key: HDFS-1601 URL: https://issues.apache.org/jira/browse/HDFS-1601 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1601.txt, hdfs-1601.txt I noticed in an hbase benchmark that the packet counts in my network monitoring seemed high, so took a short pcap trace and found that each pipeline ACK was being sent as five packets, the first four of which only contain one byte. We should buffer these bytes and send the PipelineAck as one TCP packet. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
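The buffering idea in HDFS-1601 can be sketched in a few lines. This is not the actual PipelineAck serialization (the field layout below is a made-up stand-in, and a `ByteArrayOutputStream` stands in for the socket stream); it only shows how a buffered stream turns several tiny writes into one:

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch of the buffering idea, not the actual PipelineAck code: writing
// several small fields straight to a socket stream can emit one TCP packet
// per write; wrapping the stream in a buffer coalesces the whole ack into
// a single write on flush().
public class AckBuffering {
    // Hypothetical ack layout: a long seqno followed by one short per reply.
    static byte[] writeAck(long seqno, short[] replies) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the socket
        try (DataOutputStream out =
                 new DataOutputStream(new BufferedOutputStream(sink, 64))) {
            out.writeLong(seqno);   // buffered, not yet "on the wire"
            for (short r : replies) {
                out.writeShort(r);  // still buffered
            }
            out.flush();            // everything leaves as one write
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for a byte-array sink
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] wire = writeAck(42L, new short[] {0, 0, 0});
        System.out.println(wire.length); // 8-byte seqno + three 2-byte replies = 14
    }
}
```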
[jira] [Commented] (HDFS-1600) editsStored.xml cause release audit warning
[ https://issues.apache.org/jira/browse/HDFS-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022436#comment-13022436 ] Hudson commented on HDFS-1600: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) editsStored.xml cause release audit warning --- Key: HDFS-1600 URL: https://issues.apache.org/jira/browse/HDFS-1600 Project: Hadoop HDFS Issue Type: Bug Components: build, test Reporter: Tsz Wo (Nicholas), SZE Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: h1600_20110126.patch, hadoop-1600.txt The file {{src/test/hdfs/org/apache/hadoop/hdfs/tools/offlineEditsViewer/editsStored.xml}} causes a release audit warning for any new patch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1189) Quota counts missed between clear quota and set quota
[ https://issues.apache.org/jira/browse/HDFS-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022434#comment-13022434 ] Hudson commented on HDFS-1189: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Quota counts missed between clear quota and set quota - Key: HDFS-1189 URL: https://issues.apache.org/jira/browse/HDFS-1189 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: Kang Xiao Assignee: John George Labels: hdfs, quota Fix For: 0.20.204.0, 0.21.1, Federation Branch, 0.22.0, 0.23.0 Attachments: HDFS-1189-for_20.204.patch, HDFS-1189-for_20.204.patch, HDFS-1189.patch, HDFS-1189.patch, HDFS-1189.patch, hdfs-1189-1.patch HDFS quota counts can be missed between a clear quota operation and a set quota operation. When setting quota for a dir, the INodeDirectory will be replaced by INodeDirectoryWithQuota and dir.isQuotaSet() becomes true. When INodeDirectoryWithQuota is newly created, quota counting will be performed. However, when clearing quota, the quota conf is set to -1 and dir.isQuotaSet() becomes false, while INodeDirectoryWithQuota will NOT be replaced back to INodeDirectory. FSDirectory.updateCount only updates the quota count for inodes whose isQuotaSet() is true. So after clearing quota for a dir, its quota counts will not be updated, which is reasonable. But when re-setting quota for this dir, quota counting will not be performed and some counts will be missed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1550) NPE when listing a file with no location
[ https://issues.apache.org/jira/browse/HDFS-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022437#comment-13022437 ] Hudson commented on HDFS-1550: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) NPE when listing a file with no location Key: HDFS-1550 URL: https://issues.apache.org/jira/browse/HDFS-1550 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.22.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Priority: Blocker Fix For: 0.22.0 Attachments: nullLocatedBlocks.patch The lines listed below will cause a NullPointerException in DFSUtil.locatedBlocks2Locations (line 208) because EMPTY_BLOCK_LOCS returns null when blocks.getLocatedBlocks() is called {noformat} /** a default LocatedBlocks object, its content should not be changed */ private final static LocatedBlocks EMPTY_BLOCK_LOCS = new LocatedBlocks(); {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
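The defensive pattern for this class of NPE is a null-to-empty guard at the call site. `safeBlocks` is a hypothetical helper (the real logic lives in `DFSUtil.locatedBlocks2Locations`); it only illustrates treating a null block list as empty:

```java
import java.util.Collections;
import java.util.List;

// Sketch, not the committed HDFS-1550 fix: a shared "empty" LocatedBlocks
// object can report a null block list, so callers must treat null as an
// empty list instead of dereferencing it.
public class NullBlockList {
    // Hypothetical helper standing in for the guard inside
    // DFSUtil.locatedBlocks2Locations.
    static <T> List<T> safeBlocks(List<T> maybeNull) {
        return maybeNull == null ? Collections.<T>emptyList() : maybeNull;
    }

    public static void main(String[] args) {
        // A null list (as returned by EMPTY_BLOCK_LOCS.getLocatedBlocks())
        // now yields an empty result instead of an NPE.
        System.out.println(safeBlocks(null).size()); // 0
    }
}
```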
[jira] [Commented] (HDFS-1806) TestBlockReport.blockReport_08() and _09() are timing-dependent and likely to fail on fast servers
[ https://issues.apache.org/jira/browse/HDFS-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022430#comment-13022430 ] Hudson commented on HDFS-1806: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1806. TestBlockReport.blockReport_08() and _09() are timing-dependent and likely to fail on fast servers. Contributed by Matt Foley. TestBlockReport.blockReport_08() and _09() are timing-dependent and likely to fail on fast servers -- Key: HDFS-1806 URL: https://issues.apache.org/jira/browse/HDFS-1806 Project: Hadoop HDFS Issue Type: Bug Components: data-node, name-node Affects Versions: 0.22.0, 0.23.0 Reporter: Matt Foley Assignee: Matt Foley Fix For: 0.22.0, 0.23.0 Attachments: TestBlockReport.java.patch, blockReport_08_failure_log.html Method waitForTempReplica() polls every 100ms during block replication, attempting to catch a datanode in the state of having a TEMPORARY replica. But examination of a current Hudson test failure log shows that the replica goes from start to TEMPORARY to FINALIZED in only 50ms, so of course the poll usually misses it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1502) TestBlockRecovery triggers NPE in assert
[ https://issues.apache.org/jira/browse/HDFS-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022442#comment-13022442 ] Hudson commented on HDFS-1502: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestBlockRecovery triggers NPE in assert Key: HDFS-1502 URL: https://issues.apache.org/jira/browse/HDFS-1502 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Eli Collins Assignee: Hairong Kuang Priority: Minor Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1502.patch, fixTestBlockRecovery.patch {noformat} Testcase: testRBW_RWRReplicas took 10.333 sec Caused an ERROR null java.lang.NullPointerException at org.apache.hadoop.hdfs.server.datanode.DataNode.syncBlock(DataNode.java:1881) at org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery.testSyncReplicas(TestBlockRecovery.java:144) at org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery.testRBW_RWRReplicas(TestBlockRecovery.java:305) {noformat} {noformat} Block reply = r.datanode.updateReplicaUnderRecovery( r.rInfo, recoveryId, newBlock.getNumBytes()); assert reply.equals(newBlock) && reply.getNumBytes() == newBlock.getNumBytes() : "Updated replica must be the same as the new block."; // line 1881 {noformat} Not sure how reply could be null since updateReplicaUnderRecovery always returns a newly instantiated object. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1506) Refactor fsimage loading code
[ https://issues.apache.org/jira/browse/HDFS-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022444#comment-13022444 ] Hudson commented on HDFS-1506: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Refactor fsimage loading code - Key: HDFS-1506 URL: https://issues.apache.org/jira/browse/HDFS-1506 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 Attachments: refactorImageLoader.patch, refactorImageLoader1.patch I plan to do some code refactoring to make HDFS-1070 simpler. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1559) Add missing UGM overrides to TestRefreshUserMappings
[ https://issues.apache.org/jira/browse/HDFS-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022446#comment-13022446 ] Hudson commented on HDFS-1559: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Add missing UGM overrides to TestRefreshUserMappings Key: HDFS-1559 URL: https://issues.apache.org/jira/browse/HDFS-1559 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1559.txt The commit of HADOOP-6864 added new methods to GroupMappingServiceProvider and broke trunk compilation for HDFS. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1557) Separate Storage from FSImage
[ https://issues.apache.org/jira/browse/HDFS-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022443#comment-13022443 ] Hudson commented on HDFS-1557: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Separate Storage from FSImage - Key: HDFS-1557 URL: https://issues.apache.org/jira/browse/HDFS-1557 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.21.0 Reporter: Ivan Kelly Assignee: Ivan Kelly Fix For: 0.23.0 Attachments: 1557-suggestions.txt, HDFS-1557-branch-0.22.diff, HDFS-1557-branch-0.22.diff, HDFS-1557-trunk.diff, HDFS-1557-trunk.diff, HDFS-1557-trunk.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff, HDFS-1557.diff FSImage currently derives from Storage and FSEditLog has to call methods directly on FSImage to access the filesystem. This JIRA is to separate the Storage class out into NNStorage so that FSEditLog is less dependent on FSImage. From this point, the other parts of the circular dependency should be easy to fix. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-884) DataNode makeInstance should report the directory list when failing to start up
[ https://issues.apache.org/jira/browse/HDFS-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022448#comment-13022448 ] Hudson commented on HDFS-884: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) DataNode makeInstance should report the directory list when failing to start up --- Key: HDFS-884 URL: https://issues.apache.org/jira/browse/HDFS-884 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 0.22.0 Attachments: HDFS-884.patch, HDFS-884.patch, InvalidDirs.patch, InvalidDirs.patch, InvalidDirs.patch When {{Datanode.makeInstance()}} cannot work with one of the directories in dfs.data.dir, it logs this at warn level (while losing the stack trace). It should include the nested exception for better troubleshooting. Then, when all dirs in the list fail, an exception is thrown, but this exception does not include the list of directories. It should list the absolute path of every missing/failing directory, so that whoever sees the exception can see where to start looking for problems: either the filesystem or the configuration. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1540) Make Datanode handle errors to namenode.register call more elegantly
[ https://issues.apache.org/jira/browse/HDFS-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022451#comment-13022451 ] Hudson commented on HDFS-1540: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Make Datanode handle errors to namenode.register call more elegantly Key: HDFS-1540 URL: https://issues.apache.org/jira/browse/HDFS-1540 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20.1, 0.20.2, 0.21.0 Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.23.0 Attachments: datanodeException1.txt, datanodeException2.txt, datanodeException3.txt, datanodeException4.txt, datanodeException5.txt, datanodeException5.txt When a datanode receives a "Connection reset by peer" from namenode.register(), it exits. This causes many datanodes to die. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1541) Not marking datanodes dead When namenode in safemode
[ https://issues.apache.org/jira/browse/HDFS-1541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022450#comment-13022450 ] Hudson commented on HDFS-1541: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Not marking datanodes dead When namenode in safemode Key: HDFS-1541 URL: https://issues.apache.org/jira/browse/HDFS-1541 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.20.204.0, 0.23.0 Attachments: deadnodescheck.patch, deadnodescheck1.patch, deadnodescheck1_0.20-security.patch In a big cluster, when namenode starts up, it takes a long time for namenode to process block reports from all datanodes. Because heartbeat processing gets delayed, some datanodes are erroneously marked as dead, then later on they have to register again, thus wasting time. It would speed up starting time if the checking of dead nodes is disabled while the namenode is in safemode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1551) fix the pom template's version
[ https://issues.apache.org/jira/browse/HDFS-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022441#comment-13022441 ] Hudson commented on HDFS-1551: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) fix the pom template's version -- Key: HDFS-1551 URL: https://issues.apache.org/jira/browse/HDFS-1551 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 0.23.0 Reporter: Giridharan Kesavan Assignee: Giridharan Kesavan Fix For: 0.23.0 Attachments: hdfs-1551.patch pom templates in the ivy folder should be updated to the latest versions of the hadoop-common dependencies. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1552) Remove java5 dependencies from build
[ https://issues.apache.org/jira/browse/HDFS-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022440#comment-13022440 ] Hudson commented on HDFS-1552: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Remove java5 dependencies from build Key: HDFS-1552 URL: https://issues.apache.org/jira/browse/HDFS-1552 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 0.21.1, Federation Branch Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.21.1 Attachments: HDFS-1552.patch As the first short-term step let's remove JDK5 dependency from build(s) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1486) Generalize CLITest structure and interfaces to facilitate upstream adoption (e.g. for web testing)
[ https://issues.apache.org/jira/browse/HDFS-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022447#comment-13022447 ] Hudson commented on HDFS-1486: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Generalize CLITest structure and interfaces to facilitate upstream adoption (e.g. for web testing) -- Key: HDFS-1486 URL: https://issues.apache.org/jira/browse/HDFS-1486 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 0.23.0 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.23.0 Attachments: HDFS-1486.patch, HDFS-1486.patch, HDFS-1486.patch, HDFS-1486.patch, HDFS-1486.patch HDFS part of HADOOP-7014. HDFS side of TestCLI doesn't require any special changes but needs to be aligned with Common -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1504) FSImageSaver should catch all exceptions, not just IOE
[ https://issues.apache.org/jira/browse/HDFS-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022445#comment-13022445 ] Hudson commented on HDFS-1504: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) FSImageSaver should catch all exceptions, not just IOE -- Key: HDFS-1504 URL: https://issues.apache.org/jira/browse/HDFS-1504 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.22.0 Attachments: hdfs-1504.txt FSImageSaver currently just catches IOE. This means that if some other error like OOME or failed assert happens in saving one of the images, the coordinating thread won't know there was a problem. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
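The failure mode in HDFS-1504 — a worker thread dying on a non-IOException error without the coordinator noticing — can be shown with a small sketch. `SaverSketch` and its fields are hypothetical, not the FSImageSaver code:

```java
// Sketch of the idea (hypothetical names, not the FSImageSaver code): a
// worker Runnable that only catches IOException dies silently on an
// OutOfMemoryError or a failed assert, so the coordinating thread never
// learns the save failed. Catching Throwable and recording it lets the
// coordinator detect the failure after join().
public class SaverSketch implements Runnable {
    private volatile Throwable failure;

    private void saveImage() {
        // Simulate a failure that is NOT an IOException.
        throw new AssertionError("simulated failure during save");
    }

    @Override public void run() {
        try {
            saveImage();
        } catch (Throwable t) {   // not just IOException
            failure = t;          // coordinator inspects this after join()
        }
    }

    // Run one saver to completion and report what (if anything) it threw.
    static Throwable runOnce() {
        SaverSketch saver = new SaverSketch();
        Thread t = new Thread(saver);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return saver.failure;
    }

    public static void main(String[] args) {
        System.out.println(runOnce() instanceof AssertionError); // true
    }
}
```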
[jira] [Commented] (HDFS-1476) listCorruptFileBlocks should be functional while the name node is still in safe mode
[ https://issues.apache.org/jira/browse/HDFS-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022458#comment-13022458 ] Hudson commented on HDFS-1476: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) listCorruptFileBlocks should be functional while the name node is still in safe mode Key: HDFS-1476 URL: https://issues.apache.org/jira/browse/HDFS-1476 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.23.0 Reporter: Patrick Kling Assignee: Patrick Kling Fix For: 0.23.0 Attachments: HDFS-1476.2.patch, HDFS-1476.3.patch, HDFS-1476.4.patch, HDFS-1476.5.patch, HDFS-1476.patch This would allow us to detect whether missing blocks can be fixed using Raid and if that is the case exit safe mode earlier. One way to make listCorruptFileBlocks available before the name node has exited from safe mode would be to perform a scan of the blocks map on each call to listCorruptFileBlocks to determine if there are any blocks with no replicas. This scan could be parallelized by dividing the space of block IDs into multiple intervals that can be scanned independently. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1548) Fault-injection tests are executed multiple times if invoked with run-test-hdfs-fault-inject target
[ https://issues.apache.org/jira/browse/HDFS-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022454#comment-13022454 ] Hudson commented on HDFS-1548: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Fault-injection tests are executed multiple times if invoked with run-test-hdfs-fault-inject target --- Key: HDFS-1548 URL: https://issues.apache.org/jira/browse/HDFS-1548 Project: Hadoop HDFS Issue Type: Bug Components: build, test Affects Versions: 0.21.1 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.21.1 Attachments: HDFS-1548.patch, HDFS-1548.patch, HDFS-1548.patch When invoked with {{run-test-hdfs-fault-inject target}} fault injection tests are getting executed 4 times. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1511) 98 Release Audit warnings on trunk and branch-0.22
[ https://issues.apache.org/jira/browse/HDFS-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022455#comment-13022455 ] Hudson commented on HDFS-1511: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) 98 Release Audit warnings on trunk and branch-0.22 -- Key: HDFS-1511 URL: https://issues.apache.org/jira/browse/HDFS-1511 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: Nigel Daley Assignee: Jakob Homan Priority: Blocker Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1511.patch, HDFS-1511.patch, HDFS-1511.patch, releaseauditWarnings.txt There are 98 release audit warnings on trunk. See attached txt file. These must be fixed or filtered out to get back to a reasonably small number of warnings. The OK_RELEASEAUDIT_WARNINGS property in src/test/test-patch.properties should also be set appropriately in the patch that fixes this issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1543) Reduce dev. cycle time by moving system testing artifacts from default build and push to maven for HDFS
[ https://issues.apache.org/jira/browse/HDFS-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022452#comment-13022452 ] Hudson commented on HDFS-1543: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Reduce dev. cycle time by moving system testing artifacts from default build and push to maven for HDFS --- Key: HDFS-1543 URL: https://issues.apache.org/jira/browse/HDFS-1543 Project: Hadoop HDFS Issue Type: Bug Reporter: Arun C Murthy Assignee: Luke Lu Fix For: 0.20.3 Attachments: HDFS-1543.patch, hdfs-1543-trunk-v1.patch, hdfs-1543-trunk-v2.patch The current build always generates system testing artifacts and pushes them to Maven. Most developers have no need for these artifacts and no users need them. Also, fault-injection tests seem to be run multiple times, which increases the length of testing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1473) Refactor storage management into separate classes than fsimage file reading/writing
[ https://issues.apache.org/jira/browse/HDFS-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022457#comment-13022457 ] Hudson commented on HDFS-1473: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Refactor storage management into separate classes than fsimage file reading/writing --- Key: HDFS-1473 URL: https://issues.apache.org/jira/browse/HDFS-1473 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.22.0, 0.23.0 Attachments: hdfs-1473-followup.2.txt, hdfs-1473-followup.3.txt, hdfs-1473-followup.txt, hdfs-1473-prelim.txt, hdfs-1473.txt, hdfs-1473.txt, hdfs-1473.txt Currently the FSImage class is responsible both for storage management (eg moving around files, tracking file names, the VERSION file, etc) as well as for the actual serialization and deserialization of the fsimage file within the storage directory. I'd like to refactor the loading and saving code into new classes. This will make testing easier and also make the major changes in HDFS-1073 easier to understand. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1335) HDFS side of HADOOP-6904: first step towards inter-version communications between dfs client and NameNode
[ https://issues.apache.org/jira/browse/HDFS-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022459#comment-13022459 ] Hudson commented on HDFS-1335: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS side of HADOOP-6904: first step towards inter-version communications between dfs client and NameNode - Key: HDFS-1335 URL: https://issues.apache.org/jira/browse/HDFS-1335 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client, name-node Affects Versions: 0.22.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 Attachments: hdfsRPC.patch, hdfsRpcVersion.patch The idea is that for getProtocolVersion, NameNode checks if the client and server versions are compatible if the server version is greater than the client version. If no, it throws a VersionIncompatible exception; otherwise, it returns the server version. On the dfs client side, when creating a NameNode proxy, the client catches the VersionMismatch exception and then checks if the client version and the server version are compatible if the client version is greater than the server version. If not compatible, it throws a VersionIncompatible exception; otherwise, it records the server version and continues. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1531) Clean up stack traces due to duplicate MXBean registration
[ https://issues.apache.org/jira/browse/HDFS-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022462#comment-13022462 ] Hudson commented on HDFS-1531: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Clean up stack traces due to duplicate MXBean registration -- Key: HDFS-1531 URL: https://issues.apache.org/jira/browse/HDFS-1531 Project: Hadoop HDFS Issue Type: Bug Components: data-node, name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Trivial Fix For: 0.22.0 Attachments: hdfs-1531.txt In the minicluster unit tests, we try to register MXBeans for each DN, but since the JMX context is JVM-wide, we get a InstanceAlreadyExistsException for all but the first. This stack trace clutters test logs a lot. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
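The collision described above comes from the JVM-wide MBean server rejecting duplicate `ObjectName`s. A minimal sketch, assuming a made-up domain and key property (this is illustrative, not the committed HDFS-1531 change): giving each instance a distinguishing key avoids the `InstanceAlreadyExistsException` and its stack trace.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Sketch, not the committed fix: the platform MBean server is JVM-wide,
// so registering the same ObjectName twice throws
// InstanceAlreadyExistsException. A per-instance key property (here a
// hypothetical "id") keeps the names distinct.
public class MXBeanNames {
    public interface DemoMXBean { int getValue(); }
    public static class Demo implements DemoMXBean {
        public int getValue() { return 1; }
    }

    // Hypothetical naming scheme; "HadoopSketch" is not a real Hadoop domain.
    static String nameFor(int id) {
        return "HadoopSketch:service=DataNode,id=" + id;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(new Demo(), new ObjectName(nameFor(1)));
        // Distinct name: no InstanceAlreadyExistsException for the second DN.
        server.registerMBean(new Demo(), new ObjectName(nameFor(2)));
        System.out.println("ok");
    }
}
```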
[jira] [Commented] (HDFS-1521) Persist transaction ID on disk between NN restarts
[ https://issues.apache.org/jira/browse/HDFS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022461#comment-13022461 ] Hudson commented on HDFS-1521: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Persist transaction ID on disk between NN restarts -- Key: HDFS-1521 URL: https://issues.apache.org/jira/browse/HDFS-1521 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: Edit log branch (HDFS-1073) Attachments: FSImageFormat.patch, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, hdfs-1521.3.txt, hdfs-1521.4.txt, hdfs-1521.5.txt, hdfs-1521.txt, hdfs-1521.txt, hdfs-1521.txt For HDFS-1073 and other future work, we'd like to have the concept of a transaction ID that is persisted on disk with the image/edits. We already have this concept in the NameNode but it resets to 0 on restart. We can also use this txid to replace the _checkpointTime_ field, I believe. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1536) Improve HDFS WebUI
[ https://issues.apache.org/jira/browse/HDFS-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022467#comment-13022467 ] Hudson commented on HDFS-1536: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Improve HDFS WebUI -- Key: HDFS-1536 URL: https://issues.apache.org/jira/browse/HDFS-1536 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 Attachments: missingBlocksWebUI.patch, missingBlocksWebUI1.patch 1. Make the missing blocks count accurate; 2. Make the under replicated blocks count excluding missing blocks. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1120) Make DataNode's block-to-device placement policy pluggable
[ https://issues.apache.org/jira/browse/HDFS-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022469#comment-13022469 ] Hudson commented on HDFS-1120: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Make DataNode's block-to-device placement policy pluggable -- Key: HDFS-1120 URL: https://issues.apache.org/jira/browse/HDFS-1120 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Jeff Hammerbacher Assignee: Harsh J Chouraria Fix For: 0.23.0 Attachments: HDFS-1120.r1.diff, HDFS-1120.r2.diff, HDFS-1120.r3.diff, HDFS-1120.r4.diff As discussed on the mailing list, as the number of disk drives per server increases, it would be useful to allow the DataNode's policy for new block placement to grow in sophistication from the current round-robin strategy. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
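One way to read "pluggable" here is an interface the DataNode consults for each new block, with round-robin as the default implementation. A sketch under that assumption — the interface and class names are illustrative, not the actual HDFS-1120 API:

```java
import java.util.List;

public class Placement {
    /** A policy object decides which volume (disk) receives the next block. */
    public interface VolumeChoosingPolicy {
        int chooseVolume(List<Long> freeSpacePerVolume, long blockSize);
    }

    /** The current default behavior: rotate through volumes that have room. */
    public static class RoundRobinPolicy implements VolumeChoosingPolicy {
        private int next = 0;

        public int chooseVolume(List<Long> free, long blockSize) {
            for (int i = 0; i < free.size(); i++) {
                int candidate = (next + i) % free.size();
                if (free.get(candidate) >= blockSize) {
                    next = (candidate + 1) % free.size();
                    return candidate;
                }
            }
            throw new IllegalStateException(
                "no volume has " + blockSize + " bytes free");
        }
    }
}
```

A smarter policy (e.g. most-free-space first) would simply be another implementation of the same interface, selected by configuration.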
[jira] [Commented] (HDFS-1534) Fix some incorrect logs in FSDirectory
[ https://issues.apache.org/jira/browse/HDFS-1534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022471#comment-13022471 ] Hudson commented on HDFS-1534: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Fix some incorrect logs in FSDirectory -- Key: HDFS-1534 URL: https://issues.apache.org/jira/browse/HDFS-1534 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 0.23.0 Attachments: hdfs-1534-1.patch FSDirectory#removeBlock has the wrong debug log; it was copied from the add-block log. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1728) SecondaryNameNode.checkpointSize is in byte but not MB.
[ https://issues.apache.org/jira/browse/HDFS-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022470#comment-13022470 ] Hudson commented on HDFS-1728: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) SecondaryNameNode.checkpointSize is in byte but not MB. --- Key: HDFS-1728 URL: https://issues.apache.org/jira/browse/HDFS-1728 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.1 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.21.1, Federation Branch, 0.22.0, 0.23.0 Attachments: h1728_20110307.patch, h1728_20110307_0.21.patch The unit of SecondaryNameNode.checkpointSize is bytes, not MB as stated in the following comment.
{code}
//SecondaryNameNode.java
private long checkpointSize; // size (in MB) of current Edit Log
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
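The practical point is a unit question: the field holds bytes and is compared against fs.checkpoint.size, which is also in bytes (default 67108864, i.e. 64 MB). A minimal illustration of the corrected reading, not the actual SecondaryNameNode code:

```java
public class CheckpointTrigger {
    // corrected comment: size (in *bytes*, not MB) of the current edit log
    private long checkpointSize;

    // default fs.checkpoint.size: 64 MB expressed in bytes
    static final long DEFAULT_CHECKPOINT_SIZE = 64L * 1024 * 1024;  // 67108864

    /** True once the edit log has grown past the configured byte threshold. */
    static boolean shouldCheckpoint(long editLogBytes) {
        // bytes compared to bytes; reading checkpointSize as MB would
        // delay the size-triggered checkpoint by a factor of ~10^6
        return editLogBytes >= DEFAULT_CHECKPOINT_SIZE;
    }
}
```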
[jira] [Commented] (HDFS-1675) Transfer RBW between datanodes
[ https://issues.apache.org/jira/browse/HDFS-1675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022468#comment-13022468 ] Hudson commented on HDFS-1675: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Transfer RBW between datanodes -- Key: HDFS-1675 URL: https://issues.apache.org/jira/browse/HDFS-1675 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1675_20110228.patch, h1675_20110228b.patch, h1675_20110308.patch, h1675_20110310.patch, h1675_20110310b.patch, h1675_20110310c.patch, h1675_20110311.patch, h1675_20110313.patch This is the step \(*) described [here|https://issues.apache.org/jira/browse/HDFS-1606?focusedCommentId=12991321page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12991321]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1529) Incorrect handling of interrupts in waitForAckedSeqno can cause deadlock
[ https://issues.apache.org/jira/browse/HDFS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022463#comment-13022463 ] Hudson commented on HDFS-1529: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Incorrect handling of interrupts in waitForAckedSeqno can cause deadlock Key: HDFS-1529 URL: https://issues.apache.org/jira/browse/HDFS-1529 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.22.0 Attachments: Test.java, hdfs-1529.txt, hdfs-1529.txt, hdfs-1529.txt In HDFS-895 the handling of interrupts during hflush/close was changed to preserve interrupt status. This ends up creating an infinite loop in waitForAckedSeqno if the waiting thread gets interrupted, since Object.wait() has a strange semantic that it doesn't give up the lock even momentarily if the thread is already in interrupted state at the beginning of the call. We should decide what the correct behavior is here - if a thread is interrupted while it's calling hflush() or close() should we (a) throw an exception, perhaps InterruptedIOException (b) ignore, or (c) wait for the flush to finish but preserve interrupt status on exit? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
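The Object.wait() semantic at the heart of this bug can be demonstrated in isolation: when the calling thread's interrupt flag is already set, wait() throws InterruptedException immediately instead of blocking, so a loop that catches the exception, restores the flag, and retries the wait spins forever. A self-contained demo:

```java
public class InterruptedWaitDemo {
    // Returns true if wait() threw immediately because the thread's
    // interrupt status was already set when wait() was entered.
    public static boolean waitThrowsImmediatelyWhenInterrupted() {
        Object lock = new Object();
        Thread.currentThread().interrupt();  // pre-set interrupt status
        synchronized (lock) {
            try {
                lock.wait(10_000);           // would block 10s if not interrupted
                return false;
            } catch (InterruptedException e) {
                // thrown without any real waiting; note the interrupt
                // status is cleared at this point, which is why code that
                // re-interrupts itself before retrying wait() loops forever
                return true;
            }
        }
    }
}
```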
[jira] [Commented] (HDFS-1527) SocketOutputStream.transferToFully fails for blocks >= 2GB on 32 bit JVM
[ https://issues.apache.org/jira/browse/HDFS-1527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022465#comment-13022465 ] Hudson commented on HDFS-1527: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) SocketOutputStream.transferToFully fails for blocks >= 2GB on 32 bit JVM Key: HDFS-1527 URL: https://issues.apache.org/jira/browse/HDFS-1527 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.22.0 Environment: 32 bit JVM Reporter: Patrick Kling Assignee: Patrick Kling Fix For: 0.22.0 Attachments: HDFS-1527.2.patch, HDFS-1527.patch On 32 bit JVM, SocketOutputStream.transferToFully() fails if the block size is >= 2GB. We should fall back to a normal transfer in this case.
{code}
2010-12-02 19:04:23,490 ERROR datanode.DataNode (BlockSender.java:sendChunks(399)) - BlockSender.sendChunks() exception: java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:418)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:519)
    at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:204)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:386)
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:475)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.opReadBlock(DataXceiver.java:196)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opReadBlock(DataTransferProtocol.java:356)
    at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:328)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:130)
    at java.lang.Thread.run(Thread.java:619)
{code}
-- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
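A related general technique — capping each transferTo() call so a single request never reaches 2 GB — is sketched below. Note this is not the HDFS-1527 patch itself (which falls back to a normal transfer on 32-bit JVMs); the chunk limit and method name are illustrative:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

public class ChunkedTransfer {
    // Keep every individual sendfile request under 2 GB.
    static final long MAX_CHUNK = Integer.MAX_VALUE;

    /** Transfer exactly `count` bytes, looping over bounded chunks. */
    public static void transferFully(FileChannel in, WritableByteChannel out,
                                     long position, long count) throws IOException {
        while (count > 0) {
            long chunk = Math.min(count, MAX_CHUNK);
            long sent = in.transferTo(position, chunk, out);  // may send less
            if (sent <= 0) {
                throw new IOException("transferTo returned " + sent);
            }
            position += sent;
            count -= sent;
        }
    }
}
```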
[jira] [Commented] (HDFS-1533) A more elegant FileSystem#listCorruptFileBlocks API (HDFS portion)
[ https://issues.apache.org/jira/browse/HDFS-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022472#comment-13022472 ] Hudson commented on HDFS-1533: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) A more elegant FileSystem#listCorruptFileBlocks API (HDFS portion) -- Key: HDFS-1533 URL: https://issues.apache.org/jira/browse/HDFS-1533 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.0 Reporter: Patrick Kling Assignee: Patrick Kling Fix For: 0.23.0 Attachments: HDFS-1533.2.patch, HDFS-1533.patch This is the HDFS portion of HADOOP-7060. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1523) TestLargeBlock is failing on trunk
[ https://issues.apache.org/jira/browse/HDFS-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022473#comment-13022473 ] Hudson commented on HDFS-1523: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestLargeBlock is failing on trunk -- Key: HDFS-1523 URL: https://issues.apache.org/jira/browse/HDFS-1523 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.22.0 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1523.patch, HDFS-1523.patch TestLargeBlock has been failing for more than a week now on 0.22 and trunk with
{noformat}
java.io.IOException: Premeture EOF from inputStream
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:118)
    at org.apache.hadoop.hdfs.BlockReader.readChunk(BlockReader.java:275)
{noformat}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1797) New findbugs warning introduced by HDFS-1120
[ https://issues.apache.org/jira/browse/HDFS-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022475#comment-13022475 ] Hudson commented on HDFS-1797: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) New findbugs warning introduced by HDFS-1120 Key: HDFS-1797 URL: https://issues.apache.org/jira/browse/HDFS-1797 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: hdfs-1797.txt HDFS-1120 introduced a new findbugs warning: Unread field: org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolumeSet.curVolume This JIRA is to fix the simple error. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1731) Allow using a file to exclude certain tests from build
[ https://issues.apache.org/jira/browse/HDFS-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022474#comment-13022474 ] Hudson commented on HDFS-1731: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Allow using a file to exclude certain tests from build -- Key: HDFS-1731 URL: https://issues.apache.org/jira/browse/HDFS-1731 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Minor Fix For: 0.23.0 Attachments: hdfs-1731-cygwin-fix.txt, hdfs-1731.txt It would be nice to be able to exclude certain tests when running builds. For example, when a test is known flaky, you may want to exclude it from the main Hudson job, but not actually disable it in the codebase (so that it still runs as part of another Hudson job, for example). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1630) Checksum fsedits
[ https://issues.apache.org/jira/browse/HDFS-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022476#comment-13022476 ] Hudson commented on HDFS-1630: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Checksum fsedits Key: HDFS-1630 URL: https://issues.apache.org/jira/browse/HDFS-1630 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.23.0 Attachments: editsChecksum.patch, editsChecksum1.patch, editsChecksum2.patch HDFS-903 adds an MD5 checksum to a saved image, so that we can verify the integrity of the image at load time. The other half of the story is how to verify fsedits. Similarly we could use the checksum approach. But since a fsedit file is growing constantly, a checksum per file does not work. I am thinking of adding a checksum per transaction. Is it doable or too expensive? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
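A per-transaction checksum could work as sketched here: frame each serialized transaction as [length][payload][CRC32-of-payload] and verify on replay. This is a sketch under that assumption, not the actual editsChecksum patch (which may use a different checksum or record layout):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

public class EditChecksum {
    /** Frame one transaction as [len][payload][crc32-of-payload]. */
    public static byte[] frame(byte[] txn) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(txn, 0, txn.length);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(txn.length);
        out.write(txn);
        out.writeLong(crc.getValue());
        return bos.toByteArray();
    }

    /** Read one framed transaction back, failing loudly on corruption. */
    public static byte[] readFrame(DataInputStream in) throws IOException {
        byte[] txn = new byte[in.readInt()];
        in.readFully(txn);
        long stored = in.readLong();
        CRC32 crc = new CRC32();
        crc.update(txn, 0, txn.length);
        if (crc.getValue() != stored) {
            throw new IOException("edit log transaction checksum mismatch");
        }
        return txn;
    }
}
```

The cost is one CRC update per record plus 8 bytes on disk, which is the "is it too expensive?" question the reporter raises.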
[jira] [Commented] (HDFS-1736) Break dependency between DatanodeJspHelper and FsShell
[ https://issues.apache.org/jira/browse/HDFS-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022477#comment-13022477 ] Hudson commented on HDFS-1736: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Break dependency between DatanodeJspHelper and FsShell -- Key: HDFS-1736 URL: https://issues.apache.org/jira/browse/HDFS-1736 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.22.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Minor Labels: hadoop Fix For: 0.22.0 Attachments: HDFS-1736.patch Original Estimate: 24h Remaining Estimate: 24h DatanodeJspHelper has an artificial dependency on a date formatter field in FsShell. A pending bug is reorganizing the FsShell commands so this field will no longer be available. The dependency should be broken by having DataNodeJspHelper contain its own independent date formatter. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1738) change hdfs jmxget to return an empty string instead of null when an attribute value is not available.
[ https://issues.apache.org/jira/browse/HDFS-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022478#comment-13022478 ] Hudson commented on HDFS-1738: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) change hdfs jmxget to return an empty string instead of null when an attribute value is not available. -- Key: HDFS-1738 URL: https://issues.apache.org/jira/browse/HDFS-1738 Project: Hadoop HDFS Issue Type: Improvement Components: tools Reporter: Tanping Wang Assignee: Tanping Wang Priority: Minor Attachments: HDFS-1738.patch Currently the tool, hdfs jmxget, returns null when an attribute value is not available. A null pointer exception is thrown and the values of the remaining attributes are not printed. It makes more sense to return an empty string and continue to print out the values of the remaining attributes. Example of current behavior:
$ hdfs jmxget -server hostname.com -port 8004 -service NameNode,name=NameNodeActivity
jmx name: name=NameNodeActivity,service=NameNode
tag.ProcessName=NameNode
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.tools.JMXGet.printAllValues(JMXGet.java:106)
    at org.apache.hadoop.hdfs.tools.JMXGet.main(JMXGet.java:329)
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
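The proposed behavior — substitute an empty string for a null attribute value so the remaining attributes still print — is a one-line change in spirit. A sketch with hypothetical names (the real code lives in JMXGet.printAllValues):

```java
import java.util.Map;

public class JmxPrinter {
    /** Format attributes, printing "" instead of dying on a null value. */
    public static String format(Map<String, Object> attrs) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> e : attrs.entrySet()) {
            Object v = e.getValue();
            sb.append(e.getKey()).append('=')
              .append(v == null ? "" : v.toString())  // the proposed fix
              .append('\n');
        }
        return sb.toString();
    }
}
```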
[jira] [Commented] (HDFS-1840) Terminate LeaseChecker when all writing files are closed.
[ https://issues.apache.org/jira/browse/HDFS-1840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022479#comment-13022479 ] Hudson commented on HDFS-1840: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1840. In DFSClient, terminate the lease renewing thread when all files being written are closed for a grace period, and start a new thread when new files are opened for write. Terminate LeaseChecker when all writing files are closed. - Key: HDFS-1840 URL: https://issues.apache.org/jira/browse/HDFS-1840 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1840_20110418.patch, h1840_20110419.patch, h1840_20110419b.patch In {{DFSClient}}, when there are files opened for write, a {{LeaseChecker}} thread is started for updating the leases periodically. However, it never terminates even when all writing files are closed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
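The lifecycle described in the commit message — exit after a grace period with no open files, start a fresh thread when writing resumes — can be sketched like this. All names and the grace-period mechanics are illustrative, not the h1840 patch:

```java
public class LeaseRenewer implements Runnable {
    private final long gracePeriodMs;
    private int openFiles = 0;
    private long lastFileClosedAt = System.currentTimeMillis();
    private Thread thread;

    public LeaseRenewer(long gracePeriodMs) { this.gracePeriodMs = gracePeriodMs; }

    public synchronized void fileOpened() {
        openFiles++;
        if (thread == null || !thread.isAlive()) {  // restart on new writes
            thread = new Thread(this, "LeaseRenewer");
            thread.setDaemon(true);
            thread.start();
        }
    }

    public synchronized void fileClosed() {
        openFiles--;
        lastFileClosedAt = System.currentTimeMillis();
    }

    private synchronized boolean shouldRun() {
        // keep running while files are open, or within the grace period
        return openFiles > 0
            || System.currentTimeMillis() - lastFileClosedAt < gracePeriodMs;
    }

    public void run() {
        while (shouldRun()) {
            // a real renewer would call renewLeases() here
            try { Thread.sleep(10); } catch (InterruptedException e) { return; }
        }
        // falls off the end: the thread terminates after the grace period
    }

    public synchronized boolean isRunning() {
        return thread != null && thread.isAlive();
    }
}
```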
[jira] [Commented] (HDFS-1845) symlink comes up as directory after namenode restart
[ https://issues.apache.org/jira/browse/HDFS-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022481#comment-13022481 ] Hudson commented on HDFS-1845: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1845. Symlink comes up as directory after namenode restart. Contributed by John George symlink comes up as directory after namenode restart Key: HDFS-1845 URL: https://issues.apache.org/jira/browse/HDFS-1845 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0, 0.23.0 Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1845-2.patch, HDFS-1845-apache-2.patch, HDFS-1845-apache-3.patch, HDFS-1845-apache.patch, hdfs-1845-branch22-1.patch When a symlink is first created, it gets added to the EditLogs. When the namenode is restarted, it reads from this editlog, represents the symlink correctly, and saves this information to its image. If the namenode is restarted again, it reads from this FSImage, but thinks that the symlink is a directory. This is because it uses Block[] blocks to determine whether an INode is a directory, a file, or a symlink. Since both a directory and a symlink have blocks set to null, it thinks that a symlink is a directory. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1844) Move -fs usage tests from hdfs into common
[ https://issues.apache.org/jira/browse/HDFS-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022482#comment-13022482 ] Hudson commented on HDFS-1844: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Move -fs usage tests from hdfs into common -- Key: HDFS-1844 URL: https://issues.apache.org/jira/browse/HDFS-1844 Project: Hadoop HDFS Issue Type: Test Components: test Affects Versions: 0.23.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1844.patch The -fs usage tests are in hdfs, which causes an unnecessary synchronization of a common & hdfs bug when changing the text. The usages have no ties to hdfs, so they should be moved into common. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-900) Corrupt replicas are not tracked correctly through block report from DN
[ https://issues.apache.org/jira/browse/HDFS-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022484#comment-13022484 ] Hudson commented on HDFS-900: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Corrupt replicas are not tracked correctly through block report from DN --- Key: HDFS-900 URL: https://issues.apache.org/jira/browse/HDFS-900 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Konstantin Shvachko Priority: Blocker Fix For: 0.22.0 Attachments: log-commented, reportCorruptBlock.patch, to-reproduce.patch This one is tough to describe, but essentially the following order of events is seen to occur: # A client marks one replica of a block to be corrupt by telling the NN about it # Replication is then scheduled to make a new replica of this node # The replication completes, such that there are now 3 good replicas and 1 corrupt replica # The DN holding the corrupt replica sends a block report. Rather than telling this DN to delete the node, the NN instead marks this as a new *good* replica of the block, and schedules deletion on one of the good replicas. I don't know if this is a dataloss bug in the case of 1 corrupt replica with dfs.replication=2, but it seems feasible. I will attach a debug log with some commentary marked by '', plus a unit test patch which I can get to reproduce this behavior reliably. (it's not a proper unit test, just some edits to an existing one to show it) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-613) TestBalancer and TestBlockTokenWithDFS fail Balancer assert
[ https://issues.apache.org/jira/browse/HDFS-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022486#comment-13022486 ] Hudson commented on HDFS-613: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestBalancer and TestBlockTokenWithDFS fail Balancer assert --- Key: HDFS-613 URL: https://issues.apache.org/jira/browse/HDFS-613 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.22.0 Reporter: Konstantin Shvachko Assignee: Todd Lipcon Fix For: 0.22.0 Attachments: hdfs-613.txt Running TestBalancer with asserts on. The asserts in {{Balancer.chooseNode()}} is triggered and the test fails. We do not see it in the builds because asserts are off there. So either the assert is irrelevant or there is another bug in the Balancer code. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1591) Fix javac, javadoc, findbugs warnings
[ https://issues.apache.org/jira/browse/HDFS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022489#comment-13022489 ] Hudson commented on HDFS-1591: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Fix javac, javadoc, findbugs warnings - Key: HDFS-1591 URL: https://issues.apache.org/jira/browse/HDFS-1591 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Po Cheung Assignee: Po Cheung Fix For: 0.22.0 Attachments: hdfs-1591-trunk.patch Split from HADOOP-6642 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-981) test-contrib fails due to test-cactus failure
[ https://issues.apache.org/jira/browse/HDFS-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022483#comment-13022483 ] Hudson commented on HDFS-981: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) test-contrib fails due to test-cactus failure - Key: HDFS-981 URL: https://issues.apache.org/jira/browse/HDFS-981 Project: Hadoop HDFS Issue Type: Test Components: contrib/hdfsproxy Affects Versions: 0.22.0 Reporter: Eli Collins Assignee: Konstantin Boudnik Priority: Blocker Fix For: 0.22.0 Attachments: HDFS-981.patch Relevant output from a recent run http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/232/console
[exec] BUILD FAILED
[exec] /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/build.xml:568: The following error occurred while executing this line:
[exec] /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/src/contrib/build.xml:48: The following error occurred while executing this line:
[exec] /grid/0/hudson/hudson-slave/workspace/Hdfs-Patch-h5.grid.sp2.yahoo.net/trunk/src/contrib/hdfsproxy/build.xml:292: org.codehaus.cargo.container.ContainerException: Failed to download [http://apache.osuosl.org/tomcat/tomcat-6/v6.0.18/bin/apache-tomcat-6.0.18.zip]
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1597) Batched edit log syncs can reset synctxid throw assertions
[ https://issues.apache.org/jira/browse/HDFS-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022490#comment-13022490 ] Hudson commented on HDFS-1597: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Batched edit log syncs can reset synctxid throw assertions -- Key: HDFS-1597 URL: https://issues.apache.org/jira/browse/HDFS-1597 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.22.0 Attachments: hdfs-1597.txt, hdfs-1597.txt, hdfs-1597.txt, illustrate-test-failure.txt The top of FSEditLog.logSync has the following assertion:
{code}
assert editStreams.size() > 0 : "no editlog streams";
{code}
which should actually come after checking to see if the sync was already batched in by another thread. This is related to a second bug in which the same case causes synctxid to be reset to 0 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
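The ordering fix implied above — return early for an already-batched sync before asserting on the stream list — in miniature. Names and structure are illustrative, not the actual FSEditLog:

```java
public class SyncSketch {
    static long synctxid = 0;     // highest txid already synced
    static int editStreamCount = 1;

    /** Returns true if this call actually performed a sync. */
    static boolean logSync(long mytxid) {
        if (mytxid <= synctxid) {
            return false;  // batched in by another thread: nothing to do,
                           // and crucially no assertion is evaluated
        }
        // only now is an empty stream list actually an error
        assert editStreamCount > 0 : "no editlog streams";
        synctxid = Math.max(synctxid, mytxid);  // advance, never reset to 0
        return true;
    }
}
```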
[jira] [Commented] (HDFS-1831) HDFS equivalent of HADOOP-7223 changes to handle FileContext createFlag combinations
[ https://issues.apache.org/jira/browse/HDFS-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022491#comment-13022491 ] Hudson commented on HDFS-1831: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS equivalent of HADOOP-7223 changes to handle FileContext createFlag combinations Key: HDFS-1831 URL: https://issues.apache.org/jira/browse/HDFS-1831 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0, 0.23.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Fix For: 0.23.0 Attachments: HDFS-1831.1.patch, HDFS-1831.4.patch, HDFS-1831.patch During file creation with FileContext, the expected behavior is not clearly defined for combination of createFlag EnumSet. This is HDFS equivalent of HADOOP-7223 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1598) ListPathsServlet excludes .*.crc files
[ https://issues.apache.org/jira/browse/HDFS-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022492#comment-13022492 ] Hudson commented on HDFS-1598: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) ListPathsServlet excludes .*.crc files -- Key: HDFS-1598 URL: https://issues.apache.org/jira/browse/HDFS-1598 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.2 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.21.1, Federation Branch, 0.22.0, 0.23.0 Attachments: h1598_20110126.patch, h1598_20110126_0.20.patch The {{.*.crc}} files are excluded by default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-996) JUnit tests should never depend on anything in conf
[ https://issues.apache.org/jira/browse/HDFS-996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022493#comment-13022493 ] Hudson commented on HDFS-996: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) JUnit tests should never depend on anything in conf --- Key: HDFS-996 URL: https://issues.apache.org/jira/browse/HDFS-996 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.1 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Priority: Blocker Fix For: 0.21.1 Attachments: HDFS-996.patch Similar to MAPREDUCE-1369 we need to make sure that nothing in conf is used in the unit tests. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1833) Refactor BlockReceiver
[ https://issues.apache.org/jira/browse/HDFS-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022494#comment-13022494 ] Hudson commented on HDFS-1833: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Refactor BlockReceiver -- Key: HDFS-1833 URL: https://issues.apache.org/jira/browse/HDFS-1833 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 0.23.0 Attachments: h1833_20110412.patch, h1833_20110413.patch There is repeated code for creating log/error messages in BlockReceiver. Also, some comments in the code are incorrect, e.g.
{code}
private int numTargets; // number of downstream datanodes including myself
{code}
but the count indeed excludes the current datanode. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1596) Move secondary namenode checkpoint configs from core-default.xml to hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022495#comment-13022495 ] Hudson commented on HDFS-1596: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Move secondary namenode checkpoint configs from core-default.xml to hdfs-default.xml Key: HDFS-1596 URL: https://issues.apache.org/jira/browse/HDFS-1596 Project: Hadoop HDFS Issue Type: Improvement Components: documentation, name-node Reporter: Patrick Angeles Assignee: Harsh J Chouraria Fix For: 0.21.1, 0.22.0, 0.23.0 Attachments: HDFS-7117.r1.diff, HDFS-7117.r2.diff The following configs are in core-default.xml, but are really read by the Secondary Namenode. These should be moved to hdfs-default.xml for consistency.

<property>
  <name>fs.checkpoint.dir</name>
  <value>${hadoop.tmp.dir}/dfs/namesecondary</value>
  <description>Determines where on the local filesystem the DFS secondary
  name node should store the temporary images to merge. If this is a
  comma-delimited list of directories then the image is replicated in all
  of the directories for redundancy.</description>
</property>

<property>
  <name>fs.checkpoint.edits.dir</name>
  <value>${fs.checkpoint.dir}</value>
  <description>Determines where on the local filesystem the DFS secondary
  name node should store the temporary edits to merge. If this is a
  comma-delimited list of directories then the edits are replicated in all
  of the directories for redundancy. Default value is same as
  fs.checkpoint.dir</description>
</property>

<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
  <description>The number of seconds between two periodic checkpoints.</description>
</property>

<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>
  <description>The size of the current edit log (in bytes) that triggers a
  periodic checkpoint even if the fs.checkpoint.period hasn't expired.</description>
</property>

-- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1445) Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file
[ https://issues.apache.org/jira/browse/HDFS-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022497#comment-13022497 ] Hudson commented on HDFS-1445: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file -- Key: HDFS-1445 URL: https://issues.apache.org/jira/browse/HDFS-1445 Project: Hadoop HDFS Issue Type: Sub-task Components: data-node Affects Versions: 0.20.2 Reporter: Matt Foley Assignee: Matt Foley Fix For: 0.23.0 Attachments: HDFS-1445-trunk.v22_hdfs_2-of-2.patch It was a bit of a puzzle why we can do a full scan of a disk in about 30 seconds during FSDir() or getVolumeMap(), but the same disk took 11 minutes to do Upgrade replication via hardlinks. It turns out that the org.apache.hadoop.fs.FileUtil.createHardLink() method does an outcall to Runtime.getRuntime().exec(), to utilize native filesystem hardlink capability. So it is forking a full-weight external process, and we call it on each individual file to be replicated. As a simple check on the possible cost of this approach, I built a Perl test script (under Linux on a production-class datanode). Perl also uses a compiled and optimized p-code engine, and it has both native support for hardlinks and the ability to do exec.
- A simple script to create 256,000 files in a directory tree organized like the Datanode, took 10 seconds to run.
- Replicating that directory tree using hardlinks, the same way as the Datanode, took 12 seconds using native hardlink support.
- The same replication using outcalls to exec, one per file, took 256 seconds!
- Batching the calls, and doing 'exec' once per directory instead of once per file, took 16 seconds.
Obviously, your mileage will vary based on the number of blocks per volume. A volume with less than about 4000 blocks will have only 65 directories. 
A volume with more than 4K and less than about 250K blocks will have 4200 directories (more or less). And there are two files per block (the data file and the .meta file). So the average number of files per directory may vary from 2:1 to 500:1. A node with 50K blocks and four volumes will have 25K files per volume, or an average of about 6:1. So this change may be expected to take it down from, say, 12 minutes per volume to 2. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
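The batching idea described above can be sketched as follows. This is a minimal illustration, not the actual DataStorage/FileUtil code (the method and path names here are hypothetical): rather than forking `ln src dst` once per file, build one `ln file1 file2 ... destdir` command per directory, so the external process is forked once per directory.

```java
public class BatchHardLink {
    // Build a single shell command that hard-links every named file from
    // srcDir into dstDir. GNU ln accepts multiple source operands when the
    // final operand is a directory, so one exec covers the whole directory.
    static String batchLinkCommand(String srcDir, String dstDir, String[] fileNames) {
        StringBuilder cmd = new StringBuilder("cd ").append(srcDir).append(" && ln");
        for (String name : fileNames) {
            cmd.append(' ').append(name);
        }
        return cmd.append(' ').append(dstDir).toString();
    }

    public static void main(String[] args) {
        // hypothetical block-file names for illustration
        System.out.println(batchLinkCommand("/data/current/subdir0",
                "/data/previous/subdir0",
                new String[] { "blk_1", "blk_1.meta" }));
    }
}
```

With 500 files in a directory, this turns 500 process forks into one, which matches the 256s-to-16s improvement measured in the Perl experiment.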
[jira] [Commented] (HDFS-1750) fs -ls hftp://file not working
[ https://issues.apache.org/jira/browse/HDFS-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022498#comment-13022498 ] Hudson commented on HDFS-1750: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) fs -ls hftp://file not working -- Key: HDFS-1750 URL: https://issues.apache.org/jira/browse/HDFS-1750 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.1 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.20.204.0, 0.21.1, 0.22.0, 0.23.0 Attachments: h1750_20110314.patch, h1750_20110314_0.20-security.patch, h1750_20110314_0.21.patch {noformat} hadoop dfs -touchz /tmp/file1 # create file. OK hadoop dfs -ls /tmp/file1 # OK hadoop dfs -ls hftp://namenode:50070/tmp/file1 # FAILED: not seeing the file {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1611) Some logical issues need to address.
[ https://issues.apache.org/jira/browse/HDFS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022499#comment-13022499 ] Hudson commented on HDFS-1611: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Some logical issues need to address. Key: HDFS-1611 URL: https://issues.apache.org/jira/browse/HDFS-1611 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client, name-node Affects Versions: 0.20.1, 0.20.2 Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Priority: Minor Fix For: 0.23.0 Attachments: HDFS-1611.1.patch, HDFS-1611.patch Title: Some code level logical issues. Description:
1. DFSClient: Consider the case below; if we enable only info, then this log will never be logged.
{code}
if (ClientDatanodeProtocol.LOG.isDebugEnabled()) {
  ClientDatanodeProtocol.LOG.info("ClientDatanodeProtocol addr=" + addr);
}
{code}
2. org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerMBean()
{code}
catch (NotCompliantMBeanException e) {
  e.printStackTrace();
}
{code}
We can avoid using printStackTrace(). Better to add a log message. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
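The first issue above is a level mismatch: the guard checks the debug level but the call logs at info, so with logging configured at INFO the guard is false and the message is silently dropped. A minimal sketch of the corrected pattern, using java.util.logging for self-containment (Hadoop itself uses commons-logging, and the logger name here mirrors the one in the report):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogGuard {
    private static final Logger LOG = Logger.getLogger("ClientDatanodeProtocol");

    // The guard and the call must use the same level: check FINE (~debug)
    // and log at FINE. The broken version checked debug but logged at info,
    // so the message could never appear under an INFO-only configuration.
    static void logAddr(String addr) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("ClientDatanodeProtocol addr=" + addr);
        }
    }
}
```

The guard itself is only an optimization to skip the string concatenation when debug logging is off; matching levels keeps it from changing which messages are emitted.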
[jira] [Commented] (HDFS-1448) Create multi-format parser for edits logs file, support binary and XML formats initially
[ https://issues.apache.org/jira/browse/HDFS-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022501#comment-13022501 ] Hudson commented on HDFS-1448: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Create multi-format parser for edits logs file, support binary and XML formats initially Key: HDFS-1448 URL: https://issues.apache.org/jira/browse/HDFS-1448 Project: Hadoop HDFS Issue Type: New Feature Components: tools Affects Versions: 0.22.0 Reporter: Erik Steffl Assignee: Erik Steffl Fix For: 0.23.0 Attachments: HDFS-1448-0.22-1.patch, HDFS-1448-0.22-2.patch, HDFS-1448-0.22-3.patch, HDFS-1448-0.22-4.patch, HDFS-1448-0.22-5.patch, HDFS-1448-0.22.patch, Viewer hierarchy.pdf, editsStored Create multi-format parser for edits logs file, support binary and XML formats initially. Parsing should work from any supported format to any other supported format (e.g. from binary to XML and from XML to binary). The binary format is the format used by FSEditLog class to read/write edits file. Primary reason to develop this tool is to help with troubleshooting, the binary format is hard to read and edit (for human troubleshooters). Longer term it could be used to clean up and minimize parsers for fsimage and edits files. Edits parser OfflineEditsViewer is written in a very similar fashion to OfflineImageViewer. Next step would be to merge OfflineImageViewer and OfflineEditsViewer and use the result in both FSImage and FSEditLog. This is subject to change, specifically depending on adoption of avro (which would completely change how objects are serialized as well as provide ways to convert files to different formats). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1757) Don't compile fuse-dfs by default
[ https://issues.apache.org/jira/browse/HDFS-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022504#comment-13022504 ] Hudson commented on HDFS-1757: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Don't compile fuse-dfs by default - Key: HDFS-1757 URL: https://issues.apache.org/jira/browse/HDFS-1757 Project: Hadoop HDFS Issue Type: Improvement Components: contrib/fuse-dfs Affects Versions: 0.23.0 Reporter: Eli Collins Assignee: Eli Collins Fix For: 0.23.0 Attachments: hdfs-1757-1.patch The infra machines don't have fuse headers, therefore we shouldn't compile fuse-dfs by default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1588) Add dfs.hosts.exclude to DFSConfigKeys and use constant in stead of hardcoded string
[ https://issues.apache.org/jira/browse/HDFS-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022507#comment-13022507 ] Hudson commented on HDFS-1588: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Add dfs.hosts.exclude to DFSConfigKeys and use constant in stead of hardcoded string Key: HDFS-1588 URL: https://issues.apache.org/jira/browse/HDFS-1588 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 0.23.0 Reporter: Erik Steffl Assignee: Erik Steffl Fix For: 0.23.0 Attachments: HDFS-1588-0.23.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1585) HDFS-1547 broke MR build
[ https://issues.apache.org/jira/browse/HDFS-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022509#comment-13022509 ] Hudson commented on HDFS-1585: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) HDFS-1547 broke MR build Key: HDFS-1585 URL: https://issues.apache.org/jira/browse/HDFS-1585 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Fix For: 0.23.0 Attachments: hdfs-1585.txt Added a parameter to startDatanodes without maintaining old API -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1586) Add InterfaceAudience annotation to MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022511#comment-13022511 ] Hudson commented on HDFS-1586: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Add InterfaceAudience annotation to MiniDFSCluster -- Key: HDFS-1586 URL: https://issues.apache.org/jira/browse/HDFS-1586 Project: Hadoop HDFS Issue Type: Improvement Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: HDFS-1586.1.patch, HDFS-1586.patch MiniDFSCluster is used both by hdfs and mapreduce. Annotation needs to be added to this class to reflect this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1618) configure files that are generated as part of the released tarball need to have executable bit set
[ https://issues.apache.org/jira/browse/HDFS-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022510#comment-13022510 ] Hudson commented on HDFS-1618: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) configure files that are generated as part of the released tarball need to have executable bit set --- Key: HDFS-1618 URL: https://issues.apache.org/jira/browse/HDFS-1618 Project: Hadoop HDFS Issue Type: Improvement Components: build Affects Versions: 0.22.0 Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 0.22.0 Attachments: HDFS-1618.patch Currently the configure files that are packaged in a tarball are -rw-rw-r-- -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1824) delay instantiation of file system object until it is needed (linked to HADOOP-7207)
[ https://issues.apache.org/jira/browse/HDFS-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022514#comment-13022514 ] Hudson commented on HDFS-1824: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) delay instantiation of file system object until it is needed (linked to HADOOP-7207) Key: HDFS-1824 URL: https://issues.apache.org/jira/browse/HDFS-1824 Project: Hadoop HDFS Issue Type: Bug Reporter: Boris Shkolnik Assignee: Boris Shkolnik Attachments: HDFS-1824-1-22.patch, HDFS-1824-1.patch, HDFS-1824.patch Also refactor the code a little bit to avoid checking for instanceof DFS in multiple places. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1582) Remove auto-generated native build files
[ https://issues.apache.org/jira/browse/HDFS-1582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022513#comment-13022513 ] Hudson commented on HDFS-1582: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Remove auto-generated native build files Key: HDFS-1582 URL: https://issues.apache.org/jira/browse/HDFS-1582 Project: Hadoop HDFS Issue Type: Improvement Components: contrib/libhdfs Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik Fix For: 0.22.0, 0.23.0 Attachments: HADOOP-6436.patch, HDFS-1582.diff Original Estimate: 24h Remaining Estimate: 24h The repo currently includes the automake and autoconf generated files for the native build. Per discussion on HADOOP-6421 let's remove them and use the host's automake and autoconf. We should also do this for libhdfs and fuse-dfs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1821) FileContext.createSymlink with kerberos enabled sets wrong owner
[ https://issues.apache.org/jira/browse/HDFS-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022508#comment-13022508 ] Hudson commented on HDFS-1821: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) FileContext.createSymlink with kerberos enabled sets wrong owner Key: HDFS-1821 URL: https://issues.apache.org/jira/browse/HDFS-1821 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.22.0 Environment: Kerberos enabled on cluster Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1821-2.patch, HDFS-1821.patch TEST SETUP Using attached sample hdfs java program that illustrates the issue. Using cluster with Kerberos enabled on cluster
# Compile instructions
$ javac Symlink.java -cp `hadoop classpath`
$ jar -cfe Symlink.jar Symlink Symlink.class
# create test file for symlink to use
1. hadoop fs -touchz /user/username/filetest
# Create symlink using file context
2. hadoop jar Symlink.jar ln /user/username/filetest /user/username/testsymlink
# Verify owner of test file
3. hadoop jar Symlink.jar ls /user/username/
-rw-r--r-- username hdfs /user/jeagles/filetest
-rwxrwxrwx usern...@xx..x.xxx hdfs /user/username/testsymlink
RESULTS
1. Owner shows 'usern...@xx..x.xxx' for the symlink, expecting 'username'.
2. Symlink is corrupted and can't be removed, since it was created with an invalid user.
Sample program to create Symlink:
FileContext fc = FileContext.getFileContext(getConf());
fc.createSymlink(target, symlink, false);
--- -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1817) Split TestFiDataTransferProtocol.java into two files
[ https://issues.apache.org/jira/browse/HDFS-1817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022515#comment-13022515 ] Hudson commented on HDFS-1817: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Split TestFiDataTransferProtocol.java into two files Key: HDFS-1817 URL: https://issues.apache.org/jira/browse/HDFS-1817 Project: Hadoop HDFS Issue Type: Improvement Components: test Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Trivial Fix For: 0.23.0 Attachments: h1817_20110407.patch {{TestFiDataTransferProtocol}} has tests from pipeline_Fi_01 to _16 and pipeline_Fi_39 to _51. It is natural to split them into two files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1782) FSNamesystem.startFileInternal(..) throws NullPointerException
[ https://issues.apache.org/jira/browse/HDFS-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022516#comment-13022516 ] Hudson commented on HDFS-1782: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) FSNamesystem.startFileInternal(..) throws NullPointerException -- Key: HDFS-1782 URL: https://issues.apache.org/jira/browse/HDFS-1782 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: John George Assignee: John George Fix For: 0.22.0, 0.23.0 Attachments: HDFS-1782.patch I'm observing that when one balancer is already running, trying to run another one results in a java.lang.NullPointerException. I was hoping to see the message "Another balancer is running. Exiting ...". This is a reproducible issue. Details:
1) Cluster -elrond [hdfs@]$ hadoop version
2) Run first balancer [hdfs]$ hdfs balancer 1 through XX.XX.XX.XX:1004 is succeeded. [hdfs@]$ hdfs balancer 11/03/09 16:34:32 INFO balancer.Balancer: namenodes =
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1400)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1284)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:779)
    at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:346)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1399)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1395)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1094)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1393)
. Exiting ... Balancing took 1.366 seconds -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1763) Replace hard-coded option strings with variables from DFSConfigKeys
[ https://issues.apache.org/jira/browse/HDFS-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022517#comment-13022517 ] Hudson commented on HDFS-1763: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Replace hard-coded option strings with variables from DFSConfigKeys --- Key: HDFS-1763 URL: https://issues.apache.org/jira/browse/HDFS-1763 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 0.23.0 Attachments: hdfs-1763-1.patch, hdfs-1763-2.patch There are some places in the code where we use hard-coded strings instead of the equivalent DFSConfigKeys define, and a couple places where the default is defined multiple places (once in DFSConfigKeys and once elsewhere, though both have the same value). This is error-prone, and also a pain in that it prevents eclipse from easily showing you all the places where a particular config option is used. Let's replace all the uses of the hard-coded option strings with uses of the corresponding variables in DFSConfigKeys. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
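The pattern HDFS-1763 asks for can be sketched in a few lines. This is an illustration of the DFSConfigKeys convention, not the actual class (the key and default shown are hypothetical examples of the shape): each option gets one named constant for its key and one for its default, so every call site references the constant and an IDE can find all usages.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigKeysSketch {
    // One definition per option: a typo here fails to compile at the call
    // site, whereas a typo in a repeated string literal fails silently.
    public static final String DFS_HOSTS_EXCLUDE_KEY = "dfs.hosts.exclude";
    public static final String DFS_HOSTS_EXCLUDE_DEFAULT = "";

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // instead of conf.getOrDefault("dfs.hosts.exclude", "") repeated
        // at each call site:
        String exclude = conf.getOrDefault(DFS_HOSTS_EXCLUDE_KEY, DFS_HOSTS_EXCLUDE_DEFAULT);
        System.out.println("exclude file: '" + exclude + "'");
    }
}
```

Keeping the default next to the key also prevents the drift the issue mentions, where the same default was defined both in DFSConfigKeys and at a call site.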
[jira] [Commented] (HDFS-1760) problems with getFullPathName
[ https://issues.apache.org/jira/browse/HDFS-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022518#comment-13022518 ] Hudson commented on HDFS-1760: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) problems with getFullPathName - Key: HDFS-1760 URL: https://issues.apache.org/jira/browse/HDFS-1760 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Fix For: 0.23.0 Attachments: HDFS-1760-2.patch, HDFS-1760.patch FSDirectory's getFullPathName method is flawed. Given a list of inodes, it starts at index 1 instead of 0 (based on the assumption that inode[0] is always the root inode) and then builds the string with /+inode[i]. This means the empty string is returned for the root, or when requesting the full path of the parent dir for top level items. In addition, it's not guaranteed that the list of inodes starts with the root inode. The inode lookup routine will only fill the inode array with the last n-many inodes of a path if the array is smaller than the path. In these cases, getFullPathName will skip the first component of the relative path, and then assume the second component starts at the root. ex. a/b/c becomes /b/c. There are a few places in the code where the issue was hacked around by assuming that a 0-length path meant a hardcoded / instead of Path.SEPARATOR. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
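The flaw described above is easiest to see in a standalone sketch (this is not the FSDirectory code; the method and input shape are simplified for illustration): build the path from component 0, and return "/" explicitly for the root instead of an empty string.

```java
public class FullPathName {
    // Joins path components into "/a/b/c". The flawed version started at
    // index 1, assuming component 0 was always the root, so it returned ""
    // for the root and dropped the first component of relative lookups
    // (e.g. a/b/c became /b/c).
    static String fullPathName(String[] components) {
        if (components.length == 0) {
            return "/"; // the root is "/", never the empty string
        }
        StringBuilder b = new StringBuilder();
        for (String name : components) {
            b.append('/').append(name);
        }
        return b.toString();
    }
}
```

Returning "/" for the empty case also removes the need for the hardcoded-slash workarounds the report mentions at call sites that received a 0-length path.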
[jira] [Commented] (HDFS-560) Proposed enhancements/tuning to hadoop-hdfs/build.xml
[ https://issues.apache.org/jira/browse/HDFS-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022519#comment-13022519 ] Hudson commented on HDFS-560: - Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Proposed enhancements/tuning to hadoop-hdfs/build.xml -- Key: HDFS-560 URL: https://issues.apache.org/jira/browse/HDFS-560 Project: Hadoop HDFS Issue Type: Improvement Components: build Affects Versions: 0.21.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 0.23.0 Attachments: HDFS-560.patch sibling list of HADOOP-6206, enhancements to the hdfs build for easier single-system build/test -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1785) Cleanup BlockReceiver and DataXceiver
[ https://issues.apache.org/jira/browse/HDFS-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022521#comment-13022521 ] Hudson commented on HDFS-1785: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) Cleanup BlockReceiver and DataXceiver - Key: HDFS-1785 URL: https://issues.apache.org/jira/browse/HDFS-1785 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.23.0 Attachments: h1785_20110324.patch, h1785_20110325.patch {{clientName.length()}} is used multiple times for determining whether the source is a client or a datanode.
{code}
if (clientName.length() == 0) {
  //it is a datanode
}
{code}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
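One natural cleanup for the repeated check is to name it. A minimal sketch (the helper names are hypothetical, not the patch's actual API): wrap the empty-client-name convention in a predicate so the intent is stated once.

```java
public class SourceKind {
    // Convention from the report: an empty client name means the request
    // came from another datanode rather than from a DFS client.
    static boolean isDatanode(String clientName) {
        return clientName.isEmpty();
    }

    static boolean isClient(String clientName) {
        return !clientName.isEmpty();
    }
}
```

Callers then read `if (isDatanode(clientName))` instead of repeating `clientName.length() == 0` with a comment at every site.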
[jira] [Commented] (HDFS-1818) TestHDFSCLI is failing on trunk
[ https://issues.apache.org/jira/browse/HDFS-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022522#comment-13022522 ] Hudson commented on HDFS-1818: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestHDFSCLI is failing on trunk --- Key: HDFS-1818 URL: https://issues.apache.org/jira/browse/HDFS-1818 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.23.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 0.23.0 Attachments: hdfs-1818.0.txt The commit of HADOOP-7202 changed the output of a few FsShell commands. Since HDFS tests rely on the precise format of this output, TestHDFSCLI is now failing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1628) AccessControlException should display the full path
[ https://issues.apache.org/jira/browse/HDFS-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022523#comment-13022523 ] Hudson commented on HDFS-1628: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) AccessControlException should display the full path --- Key: HDFS-1628 URL: https://issues.apache.org/jira/browse/HDFS-1628 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Ramya R Assignee: John George Priority: Minor Fix For: Federation Branch, 0.23.0 Attachments: HDFS-1628.patch, HDFS-1628.patch, HDFS-1628.patch org.apache.hadoop.security.AccessControlException should display the full path for which the access is denied. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1625) TestDataNodeMXBean fails if disk space usage changes during test run
[ https://issues.apache.org/jira/browse/HDFS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13022524#comment-13022524 ] Hudson commented on HDFS-1625: -- Integrated in Hadoop-Hdfs-trunk #643 (See [https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/643/]) TestDataNodeMXBean fails if disk space usage changes during test run Key: HDFS-1625 URL: https://issues.apache.org/jira/browse/HDFS-1625 Project: Hadoop HDFS Issue Type: Bug Components: test Reporter: Todd Lipcon Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: Federation Branch, 0.22.0, 0.23.0 Attachments: h1625_20110228.patch, h1625_20110301.patch I've seen this on our internal hudson - we get failures like: null expected:...:{freeSpace:857683[43552],usedSpace:28672,... but was:...:{freeSpace:857683[59936],usedSpace:28672,... because some other build on the same build slave used up some disk space during the middle of the test. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira