[jira] Updated: (HDFS-754) Reduce ivy console output to observable level
[ https://issues.apache.org/jira/browse/HDFS-754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik updated HDFS-754:
------------------------------------
Summary: Reduce ivy console output to observable level  (was: Reduce ivy console output to ovservable level)

> Reduce ivy console output to observable level
> Key: HDFS-754
> URL: https://issues.apache.org/jira/browse/HDFS-754
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Konstantin Boudnik
> Assignee: Konstantin Boudnik
> Fix For: 0.22.0
> Attachments: HDFS-754.patch
>
> It is very hard to see what's going on in the build because ivy literally floods the console with nonsensical messages...
[jira] Commented: (HDFS-890) Have a way of creating datanodes that throws a meaningful exception on failure
[ https://issues.apache.org/jira/browse/HDFS-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798759#action_12798759 ]

Philip Zeyliger commented on HDFS-890:
--------------------------------------

Do you need to mark the old API as deprecated? Isn't the documented way of starting datanodes "bin/hdfs datanode"? If we can get away with it, fewer deprecated methods are better.

I'm +1 on having real exit codes.

> Have a way of creating datanodes that throws a meaningful exception on failure
> Key: HDFS-890
> URL: https://issues.apache.org/jira/browse/HDFS-890
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node
> Affects Versions: 0.22.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
>
> In HDFS-884, I proposed printing out more details on why things fail. This is hard to test, because you need to subvert the log4j back end that your test harness will itself have grabbed. There is a way to make it testable, and to make it easier for anyone creating datanodes in-process to recognise and handle failure: have a static CreateDatanode() method that throws exceptions when directories cannot be created or other problems arise. Right now some problems trigger failure, while others just return a null reference saying something went wrong but not telling you what, in the hope that you know where the logs go. The HDFS-884 patch would be replaced by something that threw an exception; the existing methods would catch it, log it and return null. The new method would pass it straight up. This is easier to test and better for others. If people think this is good, I will code it up and mark the old API as deprecated.
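A minimal sketch of the shape being proposed, with hypothetical names throughout (DataNodeStub and createDataNode are stand-ins, not the actual HDFS-890 patch):

{code:java}
import java.io.File;
import java.io.IOException;

// Illustration of the proposal: a static factory that throws instead of
// returning null, plus a legacy wrapper that keeps the old null contract.
public class DataNodeCreation {

  static class DataNodeStub {  // placeholder for the real DataNode class
    DataNodeStub(File[] dataDirs) {}
  }

  /** Creates a datanode or fails loudly with the underlying cause. */
  public static DataNodeStub createDataNode(File[] dataDirs) throws IOException {
    for (File dir : dataDirs) {
      if (!dir.isDirectory() && !dir.mkdirs()) {
        // Instead of logging and returning null, report exactly what failed.
        throw new IOException("Cannot create datanode storage directory " + dir);
      }
    }
    return new DataNodeStub(dataDirs);
  }

  /** Legacy-style wrapper: catches, logs, and preserves the old null contract. */
  public static DataNodeStub instantiateDataNode(File[] dataDirs) {
    try {
      return createDataNode(dataDirs);
    } catch (IOException e) {
      System.err.println("Datanode creation failed: " + e.getMessage());
      return null;  // old behaviour, kept for existing in-process callers
    }
  }
}
{code}

The wrapper keeps existing callers working unchanged, while new callers get the failure cause directly and tests can assert on the thrown exception.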
[jira] Resolved: (HDFS-880) TestNNLeaseRecovery fails on windows
[ https://issues.apache.org/jira/browse/HDFS-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Boudnik resolved HDFS-880.
-------------------------------------
Resolution: Fixed
Assignee: Konstantin Boudnik
Hadoop Flags: [Reviewed]

I've committed it to the trunk and 0.21 branch.

> TestNNLeaseRecovery fails on windows
> Key: HDFS-880
> URL: https://issues.apache.org/jira/browse/HDFS-880
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 0.21.0
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Boudnik
> Fix For: 0.21.0
> Attachments: HDFS-880.patch, testNNLeaseRecovery.patch
>
> TestNNLeaseRecovery fails on windows trying to delete name-node storage directory.
[jira] Commented: (HDFS-127) DFSClient block read failures cause open DFSInputStream to become unusable
[ https://issues.apache.org/jira/browse/HDFS-127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798776#action_12798776 ]

Tsz Wo (Nicholas), SZE commented on HDFS-127:
---------------------------------------------

@Zlatin: yes, I have just checked the 0.20.1 release with the patch (4681.patch). TestFsck failed. As mentioned by Suresh, the patch causes an infinite loop in DFSClient when reading a block with all the replicas corrupted.

I don't know much about HBase, so I cannot answer your HBase question. Please check with the HBase mailing lists.

> DFSClient block read failures cause open DFSInputStream to become unusable
> Key: HDFS-127
> URL: https://issues.apache.org/jira/browse/HDFS-127
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Reporter: Igor Bolotin
> Assignee: Igor Bolotin
> Fix For: 0.21.0, 0.22.0
> Attachments: 4681.patch, h127_20091016.patch, h127_20091019.patch, h127_20091019b.patch
>
> We are using some Lucene indexes directly from HDFS, and for quite a long time we were using Hadoop version 0.15.3. When we tried to upgrade to Hadoop 0.19, index searches started to fail with exceptions like:
>
> 2008-11-13 16:50:20,314 WARN [Listener-4] [] DFSClient : DFS Read: java.io.IOException: Could not obtain block: blk_5604690829708125511_15489 file=/usr/collarity/data/urls-new/part-0/20081110-163426/_0.tis
>   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
>   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
>   at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
>   at java.io.DataInputStream.read(DataInputStream.java:132)
>   at org.apache.nutch.indexer.FsDirectory$DfsIndexInput.readInternal(FsDirectory.java:174)
>   at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:152)
>   at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>   at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:76)
>   at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:63)
>   at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:131)
>   at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:162)
>   at org.apache.lucene.index.TermInfosReader.scanEnum(TermInfosReader.java:223)
>   at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:217)
>   at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:54)
>   ...
>
> The investigation showed that the root of this issue is that we exceeded the # of xcievers in the data nodes, and that was fixed by changing the configuration setting to 2k. However, one thing that bothered me was that even after the datanodes recovered from overload and most of the client servers had been shut down, we still observed errors in the logs of the running servers. Further investigation showed that the fix for HADOOP-1911 introduced another problem: the DFSInputStream instance might become unusable once the number of failures over the lifetime of the instance exceeds the configured threshold. The fix for this specific issue seems to be trivial - just reset the failure counter before reading the next block (patch will be attached shortly). This seems to be also related to HADOOP-3185, but I'm not sure I really understand the necessity of keeping track of failed block accesses in the DFS client.
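A hedged sketch of the fix idea described in the report (the names below are illustrative stand-ins, not the real DFSClient internals): resetting the per-stream failure counter at the start of each block read keeps failures on earlier blocks from permanently poisoning the stream.

{code:java}
import java.io.IOException;

// Illustrative stand-in for DFSInputStream's retry bookkeeping.
class BlockReadRetrySketch {
  private static final int MAX_BLOCK_ACQUIRE_FAILURES = 3;
  private int failures;  // failures seen while acquiring the CURRENT block

  byte[] readBlock(long blockId) throws IOException {
    failures = 0;  // the fix: reset per block, not once per stream lifetime
    while (true) {
      try {
        return fetchFromBestDatanode(blockId);
      } catch (IOException e) {
        if (++failures >= MAX_BLOCK_ACQUIRE_FAILURES) {
          throw new IOException("Could not obtain block: blk_" + blockId);
        }
        // otherwise retry, possibly choosing a different datanode
      }
    }
  }

  private byte[] fetchFromBestDatanode(long blockId) throws IOException {
    throw new IOException("placeholder for the real datanode read");
  }
}
{code}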
[jira] Commented: (HDFS-880) TestNNLeaseRecovery fails on windows
[ https://issues.apache.org/jira/browse/HDFS-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798782#action_12798782 ]

Hudson commented on HDFS-880:
-----------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #165 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/165/]): TestNNLeaseRecovery fails on windows. Contributed by Konstantin Boudnik, Konstantin Shvachko.

> TestNNLeaseRecovery fails on windows
> Key: HDFS-880
> URL: https://issues.apache.org/jira/browse/HDFS-880
[jira] Commented: (HDFS-880) TestNNLeaseRecovery fails on windows
[ https://issues.apache.org/jira/browse/HDFS-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798784#action_12798784 ]

Eli Collins commented on HDFS-880:
----------------------------------

bq. Is it possible to instantiate FSNamesystem using the constructor that takes FSImage as a parameter? Just a thought to verify; if it is not easy, please submit the patch as it is. I remember Eli was working on mocking FSNamesystem; maybe he knows how to do it better.

My first patch for HDFS-669 mocked up FSNamesystem, so it's doable. Per our discussion a while back, we decided to write some unit tests directly against NameNode. I'm working on finishing up the symlink patches first.

> TestNNLeaseRecovery fails on windows
> Key: HDFS-880
> URL: https://issues.apache.org/jira/browse/HDFS-880
[jira] Commented: (HDFS-880) TestNNLeaseRecovery fails on windows
[ https://issues.apache.org/jira/browse/HDFS-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798788#action_12798788 ]

Konstantin Boudnik commented on HDFS-880:
-----------------------------------------

Yes, pure mocking of FSNamesystem is doable, but for the scope of this testing it is much easier to spy on a real instance and replace some of its references with mocks.

> TestNNLeaseRecovery fails on windows
> Key: HDFS-880
> URL: https://issues.apache.org/jira/browse/HDFS-880
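For readers unfamiliar with the distinction, here is a hedged Mockito illustration against a made-up stand-in class (not the real FSNamesystem wiring): a spy keeps real behaviour except where explicitly stubbed, while a pure mock has no real behaviour at all.

{code:java}
import static org.mockito.Mockito.*;

class LeaseManagerStub {  // stand-in; not a Hadoop class
  String recoverLease(String path) { return "real recovery for " + path; }
  int pendingLeases() { return 42; }
}

public class SpyVsMockExample {
  public static void main(String[] args) {
    // A spy wraps a real instance: unstubbed calls run the real code.
    LeaseManagerStub spied = spy(new LeaseManagerStub());
    doReturn(0).when(spied).pendingLeases();       // replace one method only
    System.out.println(spied.recoverLease("/f"));  // real behaviour
    System.out.println(spied.pendingLeases());     // stubbed: prints 0

    // A pure mock must have every interaction stubbed.
    LeaseManagerStub mocked = mock(LeaseManagerStub.class);
    when(mocked.recoverLease("/f")).thenReturn("canned answer");
    System.out.println(mocked.recoverLease("/f"));
  }
}
{code}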
[jira] Commented: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798833#action_12798833 ]

Konstantin Shvachko commented on HDFS-145:
------------------------------------------

+1 This indeed looks like a code cleanup.
# For the record, could you please explain the reason one of the test cases is removed in TestFileCreation?
# There is one unused import in TestFileCreation.java.

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> Key: HDFS-145
> URL: https://issues.apache.org/jira/browse/HDFS-145
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.21.0
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Fix For: 0.21.0, 0.22.0
> Attachments: corruptionDetect.patch, inconsistentLen.patch, inconsistentLen1.patch, inconsistentLen2.patch
>
> Currently the NameNode treats either the new replica or the existing replicas as corrupt if the new replica's length is inconsistent with the NN-recorded block length. The correct behavior should be:
> 1. For a block that is not under construction, the new replica should be marked as corrupt if its length is inconsistent (no matter whether shorter or longer) with the NN-recorded block length.
> 2. For an under-construction block, if the new replica's length is shorter than the NN-recorded block length, the new replica could be marked as corrupt; if the new replica's length is longer, the NN should update its recorded block length. But it should not mark existing replicas as corrupt. This is because the NN-recorded length for an under-construction block does not accurately match the block length on datanode disk. The NN should not judge an under-construction replica to be corrupt by looking at that inaccurate information: its recorded block length.
[jira] Commented: (HDFS-879) FileStatus should have the visible length of the file
[ https://issues.apache.org/jira/browse/HDFS-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798832#action_12798832 ]

Zheng Shao commented on HDFS-879:
---------------------------------

I mean the data node sends a heartbeat with the latest/visible size to the name node. The logic is like the following:
1. When I do an "ls -l" on an NFS directory that contains data written by some other clients, I should be able to see the latest file size with a fixed delay. But in HDFS, I won't be able to see the latest file size until a block is finished (and there is no time limit on that).
2. In order to achieve 1, there are 2 approaches:
2A. Add isUnderConstruction to FileStatus, so that DFSClient can easily know which files are under construction, and then DFSClient can go to the data nodes to get the latest length pretty efficiently, since most files are not under construction.
2B. Let the data node send heartbeats with the latest length; then DFSClient can directly get the latest length from the name node, and there is no need for isUnderConstruction.

> FileStatus should have the visible length of the file
> Key: HDFS-879
> URL: https://issues.apache.org/jira/browse/HDFS-879
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Zheng Shao
> Assignee: Zheng Shao
>
> Currently, the {{FileStatus}} returned by {{DistributedFileSystem.listStatus()}} (which goes through {{DFSClient.listPath()}} then {{NameNode.getListing()}}) does not have the latest file length if the file is still open for write. We should make changes in {{DFSClient.listPath()}} to override the length of the file if the file is under construction. This depends on adding an {{isUnderConstruction}} field to {{FileStatus}}.
[jira] Commented: (HDFS-879) FileStatus should have the visible length of the file
[ https://issues.apache.org/jira/browse/HDFS-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798844#action_12798844 ]

Zheng Shao commented on HDFS-879:
---------------------------------

A mistake in 2A above: FileStatus should be DFSFileStatus. Also, for 2B, I talked with Dhruba offline. A better way for this is to send the latest length NOT in a heartbeat but in a similar manner to the new-block-creation event.

> FileStatus should have the visible length of the file
> Key: HDFS-879
> URL: https://issues.apache.org/jira/browse/HDFS-879
[jira] Commented: (HDFS-879) FileStatus should have the visible length of the file
[ https://issues.apache.org/jira/browse/HDFS-879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798843#action_12798843 ]

dhruba borthakur commented on HDFS-879:
---------------------------------------

2A. I liked this better than 2B. Add a new field called underConstruction to DfsFileStatus (not FileStatus).

2B. I guess Zheng actually means that the datanode can send blockReceived messages even before the block is complete. This message would indicate that the block is not yet completed, but would carry the most up-to-date length of the block.

> FileStatus should have the visible length of the file
> Key: HDFS-879
> URL: https://issues.apache.org/jira/browse/HDFS-879
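A hedged sketch of approach 2A as discussed in this thread, with made-up names (DfsFileStatusStub, fetchVisibleLengthFromDatanode): the client overrides the NN-reported length only for files still under construction, so the extra datanode round-trip stays rare.

{code:java}
class DfsFileStatusStub {   // stand-in for the proposed DfsFileStatus
  long length;
  boolean underConstruction;  // the proposed extra field
}

class ListPathsSketch {
  DfsFileStatusStub[] listPaths(DfsFileStatusStub[] fromNameNode) {
    for (DfsFileStatusStub st : fromNameNode) {
      // Most files are closed, so most entries skip this branch entirely.
      if (st.underConstruction) {
        st.length = fetchVisibleLengthFromDatanode(st);
      }
    }
    return fromNameNode;
  }

  long fetchVisibleLengthFromDatanode(DfsFileStatusStub st) {
    return st.length;  // placeholder: would ask the last block's datanode
  }
}
{code}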
[jira] Commented: (HDFS-878) FileStatus should have a field isUnderConstruction
[ https://issues.apache.org/jira/browse/HDFS-878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798869#action_12798869 ]

Hairong Kuang commented on HDFS-878:
------------------------------------

I also like the proposal of having a DFSFileStatus. Still, this adds cost to the list-directory operation, so we need to be cautious. Zheng and Dhruba, you have not answered my question: why is it not good enough that DFSOutputStream provides the visible length and under-construction information? Do you have a use case where you have to get that information from the getFileStatus operation?

> FileStatus should have a field isUnderConstruction
> Key: HDFS-878
> URL: https://issues.apache.org/jira/browse/HDFS-878
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Zheng Shao
> Assignee: Zheng Shao
>
> Currently DFSClient has no way to know whether a file is under construction or not, unless we open the file and get its locatedBlocks (which is much more costly). However, the namenode knows whether each INode is under construction or not. We should expose that information from NameNode.getListing(), through DFSClient.listPaths(), to DistributedFileSystem.listStatus(). We should also expose that information through DFSInputStream and DFSDataInputStream if it is not there yet.
[jira] Updated: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-145:
-------------------------------
Attachment: corruptionDetect1.patch

This patch removes the unnecessary import. The reason I removed the test case in TestFileCreation is that the following assumption it is based on is wrong:

// The test file is 2 times the blocksize plus one. This means that when the
// entire file is written, the first two blocks definitely get flushed to
// the datanodes.

When an application returns from writing 2 blocks + 1 bytes of data, HDFS does not provide any guarantee of where the data is unless hflush is called. It is possible the data is still buffered at the client side.

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> Key: HDFS-145
> URL: https://issues.apache.org/jira/browse/HDFS-145
> Attachments: corruptionDetect.patch, corruptionDetect1.patch, inconsistentLen.patch, inconsistentLen1.patch, inconsistentLen2.patch
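A minimal sketch of the visibility rule being invoked here, assuming the 0.21-era hflush() API on FSDataOutputStream (earlier releases exposed a similar guarantee as sync()); the path and sizes are illustrative:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HflushExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out = fs.create(new Path("/tmp/hflush-demo"));
    out.write(new byte[1024 * 1024]);
    // At this point HDFS makes no promise about where these bytes live:
    // they may still be buffered on the client -- exactly the assumption
    // the removed test case relied on.
    out.hflush();
    // Now every byte written above is guaranteed visible to new readers.
    out.close();
    fs.close();
  }
}
{code}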
[jira] Updated: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-145:
-------------------------------
Hadoop Flags: [Reviewed]
Status: Patch Available (was: Open)

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> Key: HDFS-145
> URL: https://issues.apache.org/jira/browse/HDFS-145
[jira] Created: (HDFS-892) optionally use Avro for namenode RPC
optionally use Avro for namenode RPC
------------------------------------

Key: HDFS-892
URL: https://issues.apache.org/jira/browse/HDFS-892
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Reporter: Doug Cutting
Assignee: Doug Cutting
Fix For: 0.22.0

It should be possible to configure HDFS so that Avro is used for RPCs to the namenode.
[jira] Updated: (HDFS-892) optionally use Avro for namenode RPC
[ https://issues.apache.org/jira/browse/HDFS-892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated HDFS-892:
------------------------------
Attachment: HDFS-892.patch

Here's a first version of this. Default behaviour is unchanged. Passing Ant -Dtest.hdfs.rpc.engine=org.apache.hadoop.ipc.AvroRpcEngine will cause Avro reflection to be used on all NN protocols, and Avro-format RPC calls to be made over a Hadoop RPC tunnel. Some, but not all, tests pass in this configuration yet. The primary intent of this issue is to test that Avro reflect can correctly characterize all of the Namenode's protocols.

> optionally use Avro for namenode RPC
> Key: HDFS-892
> URL: https://issues.apache.org/jira/browse/HDFS-892
> Attachments: HDFS-892.patch
[jira] Updated: (HDFS-873) DataNode directories as URIs
[ https://issues.apache.org/jira/browse/HDFS-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-873:
-------------------------------------
Attachment: dataDirURI.patch

Makes sense. I added a test case where the URI has an authority part in it. Updated the patch to the latest trunk. Checked the test failure: TestReadWhileWriting failed because of some lease recovery issues; this is not related to this patch.

> DataNode directories as URIs
> Key: HDFS-873
> URL: https://issues.apache.org/jira/browse/HDFS-873
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node
> Affects Versions: 0.21.0
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Attachments: dataDirURI.patch, dataDirURI.patch
>
> Data-node directories {{dfs.datanode.data.dir}} should be specified as URIs in configuration files, making this consistent with how name-node directories are set up since HDFS-396.
[jira] Commented: (HDFS-873) DataNode directories as URIs
[ https://issues.apache.org/jira/browse/HDFS-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798908#action_12798908 ]

Konstantin Shvachko commented on HDFS-873:
------------------------------------------

Should we apply this to 0.21? This is an incompatible change, but it makes the configuration of name-node and data-node directories consistent. I am in doubt.

> DataNode directories as URIs
> Key: HDFS-873
> URL: https://issues.apache.org/jira/browse/HDFS-873
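To make the compatibility question concrete, a hedged sketch of the kind of parsing such a change implies (not the actual patch; DataDirParser is a made-up name): accept a data-dir entry either as a URI or as a legacy bare path.

{code:java}
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

public class DataDirParser {
  public static File parseDataDir(String value) throws URISyntaxException {
    URI uri = new URI(value);
    if (uri.getScheme() == null) {
      return new File(value);  // legacy plain-path configuration
    }
    if (!"file".equals(uri.getScheme())) {
      throw new URISyntaxException(value, "only file: URIs are supported");
    }
    return new File(uri.getPath());
  }

  public static void main(String[] args) throws Exception {
    System.out.println(parseDataDir("file:///data/dfs"));  // URI form
    System.out.println(parseDataDir("/data/dfs"));         // legacy form
  }
}
{code}

Note that a Windows path like C:\mypath is not a valid URI, which is the kind of backward-compatibility trap HDFS-456 hit for the name-node directories.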
[jira] Assigned: (HDFS-699) Primary datanode should compare replicas' on disk lengths
[ https://issues.apache.org/jira/browse/HDFS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang reassigned HDFS-699:
----------------------------------
Assignee: Hairong Kuang

> Primary datanode should compare replicas' on disk lengths
> Key: HDFS-699
> URL: https://issues.apache.org/jira/browse/HDFS-699
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Hairong Kuang
>
> According to the design, the primary datanode should compare replicas' on-disk lengths, but it is currently using Block.numBytes.
[jira] Updated: (HDFS-699) Primary datanode should compare replicas' on disk lengths
[ https://issues.apache.org/jira/browse/HDFS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-699:
-------------------------------
Attachment: HDFS-699.patch

This patch lets each replica return its on-disk length in response to a block recovery init request.

> Primary datanode should compare replicas' on disk lengths
> Key: HDFS-699
> URL: https://issues.apache.org/jira/browse/HDFS-699
> Attachments: HDFS-699.patch
[jira] Updated: (HDFS-699) Primary datanode should compare replicas' on disk lengths
[ https://issues.apache.org/jira/browse/HDFS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-699:
-------------------------------
Attachment: (was: HDFS-699.patch)

> Primary datanode should compare replicas' on disk lengths
> Key: HDFS-699
> URL: https://issues.apache.org/jira/browse/HDFS-699
[jira] Updated: (HDFS-768) HDFS Contrib project ivy dependencies are not included in binary target
[ https://issues.apache.org/jira/browse/HDFS-768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated HDFS-768:
-------------------------------
Status: Patch Available (was: Open)

> HDFS Contrib project ivy dependencies are not included in binary target
> Key: HDFS-768
> URL: https://issues.apache.org/jira/browse/HDFS-768
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: build
> Reporter: Aaron Kimball
> Assignee: Aaron Kimball
> Priority: Critical
> Attachments: HDFS-768.2.patch, HDFS-768.3.patch, HDFS-768.patch
>
> As in HADOOP-6370, only Hadoop's own library dependencies are promoted to ${build.dir}/lib; any libraries required by contribs are not redistributed.
[jira] Updated: (HDFS-699) Primary datanode should compare replicas' on disk lengths
[ https://issues.apache.org/jira/browse/HDFS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-699:
-------------------------------
Attachment: HDFS-699.patch

> Primary datanode should compare replicas' on disk lengths
> Key: HDFS-699
> URL: https://issues.apache.org/jira/browse/HDFS-699
> Attachments: HDFS-699.patch
[jira] Updated: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-145:
-------------------------------
Attachment: corruptionDetect2.patch

This patch removes a javac warning.

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> Key: HDFS-145
> URL: https://issues.apache.org/jira/browse/HDFS-145
> Attachments: corruptionDetect.patch, corruptionDetect1.patch, corruptionDetect2.patch, inconsistentLen.patch, inconsistentLen1.patch, inconsistentLen2.patch
[jira] Updated: (HDFS-873) DataNode directories as URIs
[ https://issues.apache.org/jira/browse/HDFS-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-873:
-------------------------------------
Resolution: Fixed
Fix Version/s: 0.21.0
Hadoop Flags: [Incompatible change, Reviewed]
Status: Resolved (was: Patch Available)

I just committed this.

> DataNode directories as URIs
> Key: HDFS-873
> URL: https://issues.apache.org/jira/browse/HDFS-873
> Fix For: 0.21.0
> Attachments: dataDirURI-0-21.patch, dataDirURI.patch, dataDirURI.patch
[jira] Commented: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798972#action_12798972 ]

Hadoop QA commented on HDFS-145:
--------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12429938/corruptionDetect1.patch
against trunk revision 897975.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 javac. The applied patch generated 25 javac compiler warnings (more than the trunk's current 24 warnings).
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/181/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/181/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/181/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/181/console

This message is automatically generated.

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> Key: HDFS-145
> URL: https://issues.apache.org/jira/browse/HDFS-145
[jira] Updated: (HDFS-850) Display more memory details on the web ui
[ https://issues.apache.org/jira/browse/HDFS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmytro Molkov updated HDFS-850:
-------------------------------
Attachment: screenshot-1.jpg

What it looks like.

> Display more memory details on the web ui
> Key: HDFS-850
> URL: https://issues.apache.org/jira/browse/HDFS-850
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Dmytro Molkov
> Assignee: Dmytro Molkov
> Priority: Minor
> Attachments: HDFS-850.patch, screenshot-1.jpg
>
> With HDFS-94 committed, the namenode will use JMX memory beans to get information about heap usage. They provide us with additional information, such as NonHeap memory usage and Heap Committed and Initialized memory, in addition to Used and Max. It will be useful to see that additional information on the NameNode web ui.
[jira] Updated: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-145:
-------------------------------
Status: Open (was: Patch Available)

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> Key: HDFS-145
> URL: https://issues.apache.org/jira/browse/HDFS-145
[jira] Updated: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-145:
-------------------------------
Status: Patch Available (was: Open)

> FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
> Key: HDFS-145
> URL: https://issues.apache.org/jira/browse/HDFS-145
[jira] Commented: (HDFS-768) HDFS Contrib project ivy dependencies are not included in binary target
[ https://issues.apache.org/jira/browse/HDFS-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798990#action_12798990 ]

Hadoop QA commented on HDFS-768:
--------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12429951/HDFS-768.3.patch
against trunk revision 897975.

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/console

This message is automatically generated.

> HDFS Contrib project ivy dependencies are not included in binary target
> Key: HDFS-768
> URL: https://issues.apache.org/jira/browse/HDFS-768
[jira] Commented: (HDFS-724) Pipeline close hangs if one of the datanodes is not responsive.
[ https://issues.apache.org/jira/browse/HDFS-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798993#action_12798993 ]

Hudson commented on HDFS-724:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]).

> Pipeline close hangs if one of the datanodes is not responsive.
> Key: HDFS-724
> URL: https://issues.apache.org/jira/browse/HDFS-724
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node, hdfs client
> Affects Versions: 0.21.0
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Hairong Kuang
> Priority: Blocker
> Fix For: 0.21.0, 0.22.0
> Attachments: h724_20091021.patch, pipelineHeartbeat.patch, pipelineHeartbeat1.patch, pipelineHeartbeat2.patch
>
> In the new pipeline design, pipeline close is implemented by sending an additional empty packet. If one of the datanodes does not respond to this empty packet, the pipeline hangs. It seems that there is no timeout.
[jira] Commented: (HDFS-101) DFS write pipeline : DFSClient sometimes does not detect second datanode failure
[ https://issues.apache.org/jira/browse/HDFS-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798992#action_12798992 ]

Hudson commented on HDFS-101:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]).

> DFS write pipeline: DFSClient sometimes does not detect second datanode failure
> Key: HDFS-101
> URL: https://issues.apache.org/jira/browse/HDFS-101
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.20.1
> Reporter: Raghu Angadi
> Assignee: Hairong Kuang
> Priority: Blocker
> Fix For: 0.20.2, 0.21.0, 0.22.0
> Attachments: detectDownDN-0.20.patch, detectDownDN1-0.20.patch, detectDownDN2.patch, detectDownDN3-0.20.patch, detectDownDN3.patch, hdfs-101.tar.gz
>
> When the first datanode's write to the second datanode fails or times out, DFSClient ends up marking the first datanode as the bad one and removes it from the pipeline. A similar problem exists on the DataNode as well, and it is fixed in HADOOP-3339.
> From HADOOP-3339: "The main issue is that the BlockReceiver thread (and DataStreamer in the case of DFSClient) interrupt() the 'responder' thread. But interrupting is a pretty coarse control. We don't know what state the responder is in and interrupting has different effects depending on responder state. To fix this properly we need to redesign how we handle these interactions."
> When the first datanode closes its socket from DFSClient, DFSClient should properly read all the data left in the socket. Also, the DataNode's closing of the socket should not result in a TCP reset; otherwise I think DFSClient will not be able to read from the socket.
[jira] Commented: (HDFS-630) In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.
[ https://issues.apache.org/jira/browse/HDFS-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798994#action_12798994 ]

Hudson commented on HDFS-630:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]).

> In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.
> Key: HDFS-630
> URL: https://issues.apache.org/jira/browse/HDFS-630
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs client, name-node
> Affects Versions: 0.21.0
> Reporter: Ruyue Ma
> Assignee: Cosmin Lehene
> Attachments: 0001-Fix-HDFS-630-0.21-svn-1.patch, 0001-Fix-HDFS-630-0.21-svn.patch, 0001-Fix-HDFS-630-for-0.21-and-trunk-unified.patch, 0001-Fix-HDFS-630-for-0.21.patch, 0001-Fix-HDFS-630-svn.patch, 0001-Fix-HDFS-630-svn.patch, 0001-Fix-HDFS-630-trunk-svn-1.patch, 0001-Fix-HDFS-630-trunk-svn-2.patch, 0001-Fix-HDFS-630-trunk-svn-3.patch, 0001-Fix-HDFS-630-trunk-svn-3.patch, HDFS-630.patch
>
> Created from HDFS-200. If, during a write, the dfsclient sees that a block replica location for a newly allocated block is not connectable, it re-requests the NN to get a fresh set of replica locations for the block. It tries this dfs.client.block.write.retries times (default 3), sleeping 6 seconds between each retry (see DFSClient.nextBlockOutputStream).
> This setting works well when you have a reasonably sized cluster; if you have few datanodes in the cluster, every retry may pick the dead datanode and the above logic bails out.
> Our solution: when getting block locations from the namenode, we give the NN the excluded datanodes. The list of dead datanodes is only for one block allocation.
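A hedged sketch of the solution described in the issue (NameNodeStub, addBlock, and firstUnconnectable are made-up names for illustration, not the actual DFSClient/NameNode API): the exclude list is built per block and handed back to the namenode on each retry.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ExcludeNodesSketch {
  interface NameNodeStub {
    // Returns a pipeline of datanode addresses, avoiding the excluded ones.
    String[] addBlock(List<String> excludedNodes) throws IOException;
  }

  static final int MAX_RETRIES = 3;  // cf. dfs.client.block.write.retries

  static String[] allocateBlock(NameNodeStub nn) throws IOException {
    // The exclude list lives only for this one block allocation.
    List<String> excluded = new ArrayList<String>();
    for (int retry = 0; retry < MAX_RETRIES; retry++) {
      String[] pipeline = nn.addBlock(excluded);
      String bad = firstUnconnectable(pipeline);
      if (bad == null) {
        return pipeline;   // every node in the pipeline is reachable
      }
      excluded.add(bad);   // tell the NN about the dead node next time
    }
    throw new IOException("could not build a pipeline after " + MAX_RETRIES + " retries");
  }

  static String firstUnconnectable(String[] pipeline) {
    return null;           // placeholder for a real connect check
  }
}
{code}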
[jira] Commented: (HDFS-456) Problems with dfs.name.edits.dirs as URI
[ https://issues.apache.org/jira/browse/HDFS-456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798995#action_12798995 ]

Hudson commented on HDFS-456:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]).

> Problems with dfs.name.edits.dirs as URI
> Key: HDFS-456
> URL: https://issues.apache.org/jira/browse/HDFS-456
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.21.0
> Reporter: Konstantin Shvachko
> Assignee: Konstantin Shvachko
> Priority: Blocker
> Fix For: 0.21.0
> Attachments: EditsDirsWin.patch, EditsDirsWin.patch, EditsDirsWin.patch, failing-tests.zip, HDFS-456-clean.patch, HDFS-456.patch, HDFS-456.patch, HDFS-456.patch, HDFS-456.patch, HDFS-456.patch, HDFS-456.patch, HDFS-456.patch, HDFS-456.patch
>
> There are several problems with the recent commit of HDFS-396.
> # It does not work with the default configuration file:///. Throws {{IllegalArgumentException}}.
> # *ALL* hdfs tests fail on Windows because C:\mypath is treated as an illegal URI. Backward compatibility is not provided.
> # {{IllegalArgumentException}} should not be thrown within hdfs code because it is a {{RuntimeException}}. We should throw {{IOException}} instead. This was recently discussed in another jira.
> # Why do we commit patches without running unit tests and test-patch? This is the minimum requirement for a patch to qualify as committable, right?
[jira] Commented: (HDFS-885) Datanode toString() NPEs on null dnRegistration
[ https://issues.apache.org/jira/browse/HDFS-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798996#action_12798996 ]

Hudson commented on HDFS-885:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]): Datanode toString() NPEs on null dnRegistration.

> Datanode toString() NPEs on null dnRegistration
> Key: HDFS-885
> URL: https://issues.apache.org/jira/browse/HDFS-885
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.22.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Fix For: 0.22.0
> Attachments: HDFS-885.patch
>
> {{Datanode.toString()}} assumes the {{dnRegistration}} value is never null. This is not always true, and when it is not the case, the {{toString()}} operator NPEs.
[jira] Commented: (HDFS-786) Implement getContentSummary(..) in HftpFileSystem
[ https://issues.apache.org/jira/browse/HDFS-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798998#action_12798998 ]

Hudson commented on HDFS-786:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]).

> Implement getContentSummary(..) in HftpFileSystem
> Key: HDFS-786
> URL: https://issues.apache.org/jira/browse/HDFS-786
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.22.0
> Attachments: h786_20091223.patch, h786_20091224.patch, h786_20100104.patch, h786_20100106.patch
>
> HftpFileSystem does not override getContentSummary(..). As a result, it uses FileSystem's default implementation, which computes content summary on the client side by calling listStatus(..) recursively. In contrast, DistributedFileSystem has overridden getContentSummary(..) and does the computation on the NameNode. As a result, running "fs -dus" on hftp is much slower than running it on hdfs.
[jira] Commented: (HDFS-767) Job failure due to BlockMissingException
[ https://issues.apache.org/jira/browse/HDFS-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798997#action_12798997 ]

Hudson commented on HDFS-767:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]).

> Job failure due to BlockMissingException
> Key: HDFS-767
> URL: https://issues.apache.org/jira/browse/HDFS-767
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Ning Zhang
> Assignee: Ning Zhang
> Fix For: 0.22.0
> Attachments: HDFS-767.patch, HDFS-767_2.patch, HDFS-767_3.patch, HDFS-767_4.txt
>
> If a block is requested by too many mappers/reducers (say, 3000) at the same time, a BlockMissingException is thrown because the request count exceeds the upper limit (I think 256 by default) on the number of threads accessing the same block at the same time. The DFSClient will catch that exception and retry 3 times, waiting 3 seconds before each retry. Since the wait time is a fixed value, a lot of clients will retry at about the same time, and a large portion of them will get another failure. After 3 retries, about 256*4 = 1024 clients have gotten the block. If the number of clients is more than that, the job will fail.
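A hedged sketch of the retry problem and the usual remedy (the constants and names are illustrative; the actual HDFS-767 patch may differ): scaling the wait by attempt number and adding random jitter keeps thousands of clients from retrying in lock-step.

{code:java}
import java.io.IOException;
import java.util.Random;

public class JitteredRetry {
  static final int MAX_RETRIES = 3;
  static final long BASE_WAIT_MS = 3000;
  static final Random RAND = new Random();

  interface BlockReader {
    byte[] read() throws IOException;
  }

  static byte[] readBlockWithRetry(BlockReader reader) throws Exception {
    for (int attempt = 0; ; attempt++) {
      try {
        return reader.read();
      } catch (IOException e) {  // e.g. BlockMissingException
        if (attempt >= MAX_RETRIES) {
          throw e;
        }
        // A fixed 3s sleep makes all clients retry at the same instant;
        // growing the wait per attempt and adding jitter spreads them out.
        long wait = BASE_WAIT_MS * (attempt + 1) + RAND.nextInt(3000);
        Thread.sleep(wait);
      }
    }
  }
}
{code}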
[jira] Commented: (HDFS-880) TestNNLeaseRecovery fails on windows
[ https://issues.apache.org/jira/browse/HDFS-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799004#action_12799004 ]

Hudson commented on HDFS-880:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]): TestNNLeaseRecovery fails on windows. Contributed by Konstantin Boudnik, Konstantin Shvachko.

> TestNNLeaseRecovery fails on windows
> Key: HDFS-880
> URL: https://issues.apache.org/jira/browse/HDFS-880
[jira] Commented: (HDFS-812) FSNamesystem#internalReleaseLease throws NullPointerException on a single-block file's lease recovery
[ https://issues.apache.org/jira/browse/HDFS-812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799003#action_12799003 ]

Hudson commented on HDFS-812:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]).

> FSNamesystem#internalReleaseLease throws NullPointerException on a single-block file's lease recovery
> Key: HDFS-812
> URL: https://issues.apache.org/jira/browse/HDFS-812
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.21.0, 0.22.0
> Reporter: Konstantin Boudnik
> Assignee: Konstantin Boudnik
> Priority: Blocker
> Fix For: 0.21.0, 0.22.0
> Attachments: HDFS-812.patch, HDFS-812.patch, HDFS-812.patch, HDFS-812.patch, HDFS-812.patch
>
> {{FSNamesystem.internalReleaseLease()}} uses the result of an {{iFile#numBlocks()}} call to get the number of blocks of an under-construction file. {{numBlocks()}} can return 0 if the file doesn't have any blocks yet, which will cause {{internalReleaseLease()}} to throw an ArrayIndexOutOfBoundsException. In the case of a single-block file, the same method will throw a NullPointerException, because the penultimate block is going to be null according to the logic of INodeFile#getPenultimateBlock().
[jira] Commented: (HDFS-762) Trying to start the balancer throws an NPE
[ https://issues.apache.org/jira/browse/HDFS-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799000#action_12799000 ]

Hudson commented on HDFS-762:
-----------------------------

Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]).

> Trying to start the balancer throws an NPE
> Key: HDFS-762
> URL: https://issues.apache.org/jira/browse/HDFS-762
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.21.0
> Reporter: Cristian Ivascu
> Assignee: Cristian Ivascu
> Fix For: 0.21.0
> Attachments: 0001-corrected-balancer-constructor.patch, HDFS-762.patch
>
> When trying to run the balancer, I get a NullPointerException:
>
> 2009-11-10 11:08:14,235 ERROR org.apache.hadoop.hdfs.server.balancer.Balancer: java.lang.NullPointerException
>   at org.apache.hadoop.hdfs.server.namenode.BlockPlacementPolicy.getInstance(BlockPlacementPolicy.java:161)
>   at org.apache.hadoop.hdfs.server.balancer.Balancer.checkReplicationPolicyCompatibility(Balancer.java:784)
>   at org.apache.hadoop.hdfs.server.balancer.Balancer.init(Balancer.java:792)
>   at org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:814)
>
> This happens when trying to use bin/start-balancer or "bin/hdfs balancer -threshold 10". The config files (hdfs-site and core-site) have fs.default.name set to hdfs://namenode:9000.
[jira] Commented: (HDFS-832) HDFS side of HADOOP-6222.
[ https://issues.apache.org/jira/browse/HDFS-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12798999#action_12798999 ] Hudson commented on HDFS-832: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) HDFS side of HADOOP-6222. -- Key: HDFS-832 URL: https://issues.apache.org/jira/browse/HDFS-832 Project: Hadoop HDFS Issue Type: Improvement Components: test Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Fix For: 0.22.0 Attachments: HDFS-832.patch This is for patch tracking of the HDFS part of HADOOP-6222. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-814) Add an API to get the visible length of a DFSDataInputStream.
[ https://issues.apache.org/jira/browse/HDFS-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799002#action_12799002 ] Hudson commented on HDFS-814: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) Add an API to get the visible length of a DFSDataInputStream. - Key: HDFS-814 URL: https://issues.apache.org/jira/browse/HDFS-814 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs client Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.21.0, 0.22.0 Attachments: h814_20091221.patch, h814_20091221_0.21.patch Hflush guarantees that bytes written before it are visible to new readers. However, there is no way to get the length of the visible bytes. The visible length is useful in some applications like SequenceFile. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
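Hypothetical client usage once such an API exists; the stream type and method name below follow this description and the attached patch names, but are assumptions, not a confirmed signature:
{code}
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSClient.DFSDataInputStream;

class VisibleLengthSketch {
  static long visibleLength(FileSystem fs, Path p) throws Exception {
    FSDataInputStream in = fs.open(p);
    try {
      // Assumed API from this issue: the HDFS stream type exposes how
      // many bytes hflush() has made visible to new readers.
      return ((DFSDataInputStream) in).getVisibleLength();
    } finally {
      in.close();
    }
  }
}
{code}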
[jira] Commented: (HDFS-187) TestStartup fails if hdfs is running on the same machine
[ https://issues.apache.org/jira/browse/HDFS-187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799001#action_12799001 ] Hudson commented on HDFS-187: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) TestStartup fails if hdfs is running on the same machine Key: HDFS-187 URL: https://issues.apache.org/jira/browse/HDFS-187 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.21.0, 0.22.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Todd Lipcon Fix For: 0.22.0 Attachments: hdfs-187.txt To reproduce:
- ./bin/start-dfs
- ant test-core -Dtestcase=TestStartup
The test may fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-775) FSDataset calls getCapacity() twice - bug?
[ https://issues.apache.org/jira/browse/HDFS-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799006#action_12799006 ] Hudson commented on HDFS-775: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) FSDataset calls getCapacity() twice - bug? - Key: HDFS-775 URL: https://issues.apache.org/jira/browse/HDFS-775 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.22.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 0.22.0 Attachments: HDFS-775-1.patch, HDFS-775-2.patch I'm not sure whether this is a bug or intended, but I thought I'd mention it. FSDataset.getCapacity() calls DF.getCapacity() twice when evaluating its capacity. Although there is caching to stop the shell being exec'd twice in a row, there is a risk that the first call doesn't run the shell and the second does - so the value changes during the method. If that is not intended, it is better to cache the first value for the whole method. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
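The suggested fix, sketched with stand-in types (the real FSDataset/DF classes differ): read the capacity once into a local so both uses see the same snapshot.
{code}
/** Stand-in for org.apache.hadoop.fs.DF, which may exec a shell. */
interface DiskUsage {
  long getCapacity();
}

class VolumeSketch {
  private final DiskUsage usage;
  private final long reserved;

  VolumeSketch(DiskUsage usage, long reserved) {
    this.usage = usage;
    this.reserved = reserved;
  }

  long getCapacity() {
    // Call getCapacity() exactly once; two calls could straddle the
    // cache expiry and return different values within one method.
    long capacity = usage.getCapacity();
    return (capacity > reserved) ? (capacity - reserved) : 0L;
  }
}
{code}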
[jira] Commented: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799008#action_12799008 ] Hudson commented on HDFS-755: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) Read multiple checksum chunks at once in DFSInputStream --- Key: HDFS-755 URL: https://issues.apache.org/jira/browse/HDFS-755 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.22.0 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple checksum chunks in a single call to readChunk. This is the HDFS-side use of that new feature. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-94) The Heap Size in HDFS web ui may not be accurate
[ https://issues.apache.org/jira/browse/HDFS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799010#action_12799010 ] Hudson commented on HDFS-94: Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) The Heap Size in HDFS web ui may not be accurate -- Key: HDFS-94 URL: https://issues.apache.org/jira/browse/HDFS-94 Project: Hadoop HDFS Issue Type: Bug Reporter: Tsz Wo (Nicholas), SZE Assignee: Dmytro Molkov Fix For: 0.22.0 Attachments: HDFS-94.patch It seems that the Heap Size shown in the HDFS web UI is not accurate. It keeps showing 100% usage, e.g.
{noformat}
Heap Size is 10.01 GB / 10.01 GB (100%)
{noformat}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
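The memory beans the fix relies on are standard java.lang.management API; a self-contained probe shows the three numbers a UI could report:
{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapProbe {
  public static void main(String[] args) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
    MemoryUsage heap = bean.getHeapMemoryUsage();
    // used <= committed <= max; a UI that conflates committed with
    // used (or with max) can end up permanently displaying 100%.
    System.out.printf("used=%d committed=%d max=%d%n",
        heap.getUsed(), heap.getCommitted(), heap.getMax());
  }
}
{code}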
[jira] Commented: (HDFS-849) TestFiDataTransferProtocol2#pipeline_Fi_18 sometimes fails
[ https://issues.apache.org/jira/browse/HDFS-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799012#action_12799012 ] Hudson commented on HDFS-849: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) TestFiDataTransferProtocol2#pipeline_Fi_18 sometimes fails -- Key: HDFS-849 URL: https://issues.apache.org/jira/browse/HDFS-849 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.1 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.21.0, 0.22.0 Attachments: countDown.patch TestFiDataTransferProtocol2#pipeline_Fi_18 sometimes fails with the following error:
{noformat}
junit.framework.AssertionFailedError:
at org.apache.hadoop.hdfs.server.datanode.TestFiDataTransferProtocol2.runTest17_19(TestFiDataTransferProtocol2.java:139)
at org.apache.hadoop.hdfs.server.datanode.TestFiDataTransferProtocol2.pipeline_Fi_18(TestFiDataTransferProtocol2.java:186)
{noformat}
This means the test did not trigger pipeline recovery. The test log shows that there is no fault injected into the pipeline. It turns out there is a bug in the test code: counting down 3 means injecting a fault when receiving the fourth packet, but the code allows the file to have only 3 packets. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-564) Adding pipeline test 17-35
[ https://issues.apache.org/jira/browse/HDFS-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799005#action_12799005 ] Hudson commented on HDFS-564: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) Adding pipeline test 17-35 -- Key: HDFS-564 URL: https://issues.apache.org/jira/browse/HDFS-564 Project: Hadoop HDFS Issue Type: Sub-task Components: test Affects Versions: 0.21.0 Reporter: Kan Zhang Assignee: Hairong Kuang Priority: Blocker Fix For: 0.21.0, 0.22.0 Attachments: h564-24.patch, h564-25.patch, pipelineTests.patch, pipelineTests1.patch, pipelineTests2.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-685) Use the user-to-groups mapping service in the NameNode
[ https://issues.apache.org/jira/browse/HDFS-685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799007#action_12799007 ] Hudson commented on HDFS-685: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) Use the user-to-groups mapping service in the NameNode -- Key: HDFS-685 URL: https://issues.apache.org/jira/browse/HDFS-685 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Arun C Murthy Assignee: Boris Shkolnik Fix For: 0.22.0 Attachments: HADOOP-4656_hdfs.patch, HDFS-685-4.patch, HDFS-685-5.patch, HDFS-685-6.patch, MR-1083-0_20.2.patch HADOOP-4656 introduces a user-to-groups mapping service on the server-side. The NameNode should use this to map users to their groups rather than relying on the information passed by the client. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
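Assuming the HADOOP-4656 service is exposed as {{org.apache.hadoop.security.Groups}} (hedged; the factory and method names may differ across versions), a server-side lookup is roughly:
{code}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.Groups;

class GroupLookupSketch {
  // Resolve groups on the server side instead of trusting a
  // client-supplied group list.
  static List<String> groupsFor(String user) throws IOException {
    Groups groups = Groups.getUserToGroupsMappingService(new Configuration());
    return groups.getGroups(user);
  }
}
{code}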
[jira] Commented: (HDFS-868) Link to Hadoop Upgrade Wiki is broken
[ https://issues.apache.org/jira/browse/HDFS-868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799011#action_12799011 ] Hudson commented on HDFS-868: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) Link to Hadoop Upgrade Wiki is broken - Key: HDFS-868 URL: https://issues.apache.org/jira/browse/HDFS-868 Project: Hadoop HDFS Issue Type: Bug Components: documentation Affects Versions: 0.20.1 Environment: Browsing the web to the current Hadoop HDFS documentation, http://hadoop.apache.org/common/docs/r0.20.0/hdfs_user_guide.html#Upgrade+and+Rollback (also verified as far back as 0.17.2 has this issue). Reporter: Chris A. Mattmann Priority: Trivial Fix For: 0.21.0 Attachments: HDFS-868.Mattmann.010210.2.patch.txt, HDFS-868.Mattmann.010210.patch.txt The link to the Hadoop Upgrade wiki is broken in the xdocs. Trivial patch forthcoming which addresses the issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-840) Update File Context tests to use FileContextTestHelper
[ https://issues.apache.org/jira/browse/HDFS-840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799009#action_12799009 ] Hudson commented on HDFS-840: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) Update File Context tests to use FileContextTestHelper -- Key: HDFS-840 URL: https://issues.apache.org/jira/browse/HDFS-840 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.0, 0.22.0 Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Fix For: 0.22.0 Attachments: HDFS-840.1.patch The FileContext tests in HDFS must be updated to use FileContextTestHelper; otherwise the build will fail due to changes in HADOOP-6394. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-823) In Checkpointer the getImage servlet is added to public rather than internal servlet list
[ https://issues.apache.org/jira/browse/HDFS-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799013#action_12799013 ] Hudson commented on HDFS-823: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) In Checkpointer the getImage servlet is added to public rather than internal servlet list - Key: HDFS-823 URL: https://issues.apache.org/jira/browse/HDFS-823 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0, 0.22.0 Reporter: Jakob Homan Assignee: Jakob Homan Fix For: 0.21.0, 0.22.0 Attachments: HDFS-823.patch Checkpointer.java:99
{code}
httpServer.addServlet("getimage", "/getimage", GetImageServlet.class);
{code}
This should be addInternalServlet, as it is in the NameNode, to ensure this servlet does not get filtered. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
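The fix as described, sketched against the 0.21-era HttpServer API (the exact signature is assumed):
{code}
import org.apache.hadoop.hdfs.server.namenode.GetImageServlet;
import org.apache.hadoop.http.HttpServer;

class CheckpointerHttpSetup {
  static void register(HttpServer httpServer) throws Exception {
    // addInternalServlet keeps /getimage out of the public filter
    // chain, matching what the NameNode itself does for this servlet.
    httpServer.addInternalServlet("getimage", "/getimage",
        GetImageServlet.class);
  }
}
{code}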
[jira] Commented: (HDFS-825) Build fails to pull latest hadoop-core-* artifacts
[ https://issues.apache.org/jira/browse/HDFS-825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799015#action_12799015 ] Hudson commented on HDFS-825: - Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #94 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/94/]) Build fails to pull latest hadoop-core-* artifacts -- Key: HDFS-825 URL: https://issues.apache.org/jira/browse/HDFS-825 Project: Hadoop HDFS Issue Type: Bug Components: build Affects Versions: 0.21.0, 0.22.0 Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Priority: Critical Fix For: 0.22.0 Attachments: HDFS-825.patch, latest-mod-check.patch I've noticed on more than one occasion that the Ivy cache has stale Common SNAPSHOT jar files. In some cases I've seen files more than a month old. In fact, some very bad problems weren't tested at all, because changes in Common weren't pulled into, say, HDFS where the tests had to be executed. I've noticed the same problem with MapReduce just today: the latest cached snapshot of Common was a week old. One can clean the Ivy cache to make sure that the latest versions are pulled down. However, it's inconvenient and undesirable to do this every time. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-101) DFS write pipeline : DFSClient sometimes does not detect second datanode failure
[ https://issues.apache.org/jira/browse/HDFS-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799019#action_12799019 ] Alex Loddengaard commented on HDFS-101: --- I will be out of the office Thursday, 1/7, through Wednesday, 1/13, back in the office Thursday, 1/14. I will be checking email fairly consistently in the evenings. Please contact Christophe Bisciglia (christo...@cloudera.com) with any support or training emergencies. Otherwise, you'll hear from me soon. Thanks, Alex DFS write pipeline : DFSClient sometimes does not detect second datanode failure - Key: HDFS-101 URL: https://issues.apache.org/jira/browse/HDFS-101 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1 Reporter: Raghu Angadi Assignee: Hairong Kuang Priority: Blocker Fix For: 0.20.2, 0.21.0, 0.22.0 Attachments: detectDownDN-0.20.patch, detectDownDN1-0.20.patch, detectDownDN2.patch, detectDownDN3-0.20.patch, detectDownDN3.patch, hdfs-101.tar.gz When the first datanode's write to the second datanode fails or times out, DFSClient ends up marking the first datanode as the bad one and removes it from the pipeline. A similar problem exists on the DataNode as well; it is fixed in HADOOP-3339. From HADOOP-3339: The main issue is that the BlockReceiver thread (and DataStreamer in the case of DFSClient) interrupt() the 'responder' thread. But interrupting is a pretty coarse control. We don't know what state the responder is in, and interrupting has different effects depending on responder state. To fix this properly we need to redesign how we handle these interactions. When the first datanode closes its socket to DFSClient, DFSClient should properly read all the data left in the socket. Also, the DataNode's closing of the socket should not result in a TCP reset; otherwise I think DFSClient will not be able to read from the socket. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-850) Display more memory details on the web ui
[ https://issues.apache.org/jira/browse/HDFS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799025#action_12799025 ] Suresh Srinivas commented on HDFS-850: -- This information will be useful.
# Can you print the memory usage on separate lines from the file and block counts?
# The current format is: Heap Size is <used> / <max>. Committed Heap is <committed>. Using the following format would better explain the information: Heap memory <used> used is <percent>% of committed memory <committed>. Maximum heap memory is <max>. This also calculates the % usage from the committed memory (which is what is currently available) instead of the maximum size the process can grow to.
Display more memory details on the web ui - Key: HDFS-850 URL: https://issues.apache.org/jira/browse/HDFS-850 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov Assignee: Dmytro Molkov Priority: Minor Attachments: HDFS-850.patch, screenshot-1.jpg With HDFS-94 being committed, the namenode will use JMX memory beans to get information about heap usage. They provide us with additional information such as NonHeap memory usage and Heap Committed and Initialized memory in addition to Used and Max. It will be useful to see that additional information on the NameNode web ui. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
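A sketch of the proposed wording, taking the percentage from committed rather than maximum memory (the format string is illustrative, not the committed patch):
{code}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapReportSketch {
  static String heapLine() {
    MemoryUsage h = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    // Percentage of committed memory (what is available right now),
    // not of the maximum the process could eventually grow to.
    double pct = 100.0 * h.getUsed() / h.getCommitted();
    return String.format(
        "Heap memory %d used is %.0f%% of committed memory %d. "
            + "Maximum heap memory is %d.",
        h.getUsed(), pct, h.getCommitted(), h.getMax());
  }
}
{code}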
[jira] Commented: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799035#action_12799035 ] Hadoop QA commented on HDFS-145: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12429954/corruptionDetect2.patch against trunk revision 898134.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/182/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/182/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/182/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/182/console
This message is automatically generated.
FSNameSystem#addStoredBlock does not handle inconsistent block length correctly --- Key: HDFS-145 URL: https://issues.apache.org/jira/browse/HDFS-145 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.21.0, 0.22.0 Attachments: corruptionDetect.patch, corruptionDetect1.patch, corruptionDetect2.patch, inconsistentLen.patch, inconsistentLen1.patch, inconsistentLen2.patch Currently NameNode treats either the new replica or existing replicas as corrupt if the new replica's length is inconsistent with NN recorded block length. The correct behavior should be 1. For a block that is not under construction, the new replica should be marked as corrupt if its length is inconsistent (no matter shorter or longer) with the NN recorded block length; 2. For an under construction block, if the new replica's length is shorter than the NN recorded block length, the new replica could be marked as corrupt; if the new replica's length is longer, NN should update its recorded block length. But it should not mark existing replicas as corrupt. This is because NN recorded length for an under construction block does not accurately match the block length on datanode disk. NN should not judge an under construction replica to be corrupt by looking at the inaccurate information: its recorded block length. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
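The two rules in the description, restated as a hedged decision sketch (names and types are stand-ins for the real FSNamesystem structures):
{code}
enum Action { MARK_NEW_REPLICA_CORRUPT, UPDATE_RECORDED_LENGTH, ACCEPT }

class AddStoredBlockSketch {
  static Action decide(boolean underConstruction,
                       long recordedLen, long reportedLen) {
    if (!underConstruction) {
      // Rule 1: any length mismatch marks the new replica corrupt.
      return (reportedLen != recordedLen)
          ? Action.MARK_NEW_REPLICA_CORRUPT : Action.ACCEPT;
    }
    // Rule 2: a shorter replica is corrupt; a longer one updates the
    // NN's recorded length. Existing replicas are never marked corrupt
    // here, because the recorded length is known to be inaccurate.
    if (reportedLen < recordedLen) {
      return Action.MARK_NEW_REPLICA_CORRUPT;
    }
    if (reportedLen > recordedLen) {
      return Action.UPDATE_RECORDED_LENGTH;
    }
    return Action.ACCEPT;
  }
}
{code}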
[jira] Commented: (HDFS-873) DataNode directories as URIs
[ https://issues.apache.org/jira/browse/HDFS-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799037#action_12799037 ] Hudson commented on HDFS-873: - Integrated in Hadoop-Hdfs-trunk-Commit #166 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/166/]) Remove duplicate lines introduced by [...]. HDFS-873. Configuration specifies data-node storage directories as URIs. Contributed by Konstantin Shvachko. DataNode directories as URIs Key: HDFS-873 URL: https://issues.apache.org/jira/browse/HDFS-873 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.21.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 0.21.0 Attachments: dataDirURI-0-21.patch, dataDirURI.patch, dataDirURI.patch Data-node directories {{dfs.datanode.data.dir}} should be specified as URIs in configurations files making it consistent with how name-node directories are set up since HDFS-396. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-245) Create symbolic links in HDFS
[ https://issues.apache.org/jira/browse/HDFS-245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-245: - Attachment: symlink31-hdfs.patch Latest patch attached. It addresses feedback from Sanjay's review of the common patch on HADOOP-6421, mostly just pulls the bulk of the symlink tests into a common class. Create symbolic links in HDFS - Key: HDFS-245 URL: https://issues.apache.org/jira/browse/HDFS-245 Project: Hadoop HDFS Issue Type: New Feature Reporter: dhruba borthakur Assignee: Eli Collins Attachments: 4044_20081030spi.java, designdocv1.txt, designdocv2.txt, designdocv3.txt, HADOOP-4044-strawman.patch, symlink-0.20.0.patch, symlink-25-hdfs.patch, symlink-26-hdfs.patch, symlink-26-hdfs.patch, symLink1.patch, symLink1.patch, symLink11.patch, symLink12.patch, symLink13.patch, symLink14.patch, symLink15.txt, symLink15.txt, symlink16-common.patch, symlink16-hdfs.patch, symlink16-mr.patch, symlink17-common.txt, symlink17-hdfs.txt, symlink18-common.txt, symlink19-common-delta.patch, symlink19-common.txt, symlink19-common.txt, symlink19-hdfs-delta.patch, symlink19-hdfs.txt, symlink20-common.patch, symlink20-hdfs.patch, symlink21-common.patch, symlink21-hdfs.patch, symlink22-common.patch, symlink22-hdfs.patch, symlink23-common.patch, symlink23-hdfs.patch, symlink24-hdfs.patch, symlink27-hdfs.patch, symlink28-hdfs.patch, symlink29-hdfs.patch, symlink29-hdfs.patch, symlink30-hdfs.patch, symlink31-hdfs.patch, symLink4.patch, symLink5.patch, symLink6.patch, symLink8.patch, symLink9.patch HDFS should support symbolic links. A symbolic link is a special type of file that contains a reference to another file or directory in the form of an absolute or relative path and that affects pathname resolution. Programs which read or write to files named by a symbolic link will behave as if operating directly on the target file. However, archiving utilities can handle symbolic links specially and manipulate them directly. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
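For a sense of the client surface, symlink creation is expected to go through FileContext on the common side (hedged; per the HADOOP-6421 patches, the argument order is assumed to be target-then-link):
{code}
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

class SymlinkSketch {
  static void demo() throws Exception {
    FileContext fc = FileContext.getFileContext();
    // Create /data/current -> /data/v2; 'true' creates missing parents.
    fc.createSymlink(new Path("/data/v2"), new Path("/data/current"), true);
    // Opens resolve through the link to the target transparently.
    fc.open(new Path("/data/current/part-00000")).close();
  }
}
{code}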
[jira] Updated: (HDFS-850) Display more memory details on the web ui
[ https://issues.apache.org/jira/browse/HDFS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Molkov updated HDFS-850: --- Attachment: HDFS-850.patch Updated patch addressing comments by Suresh. Display more memory details on the web ui - Key: HDFS-850 URL: https://issues.apache.org/jira/browse/HDFS-850 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov Assignee: Dmytro Molkov Priority: Minor Attachments: HDFS-850.patch, HDFS-850.patch, screenshot-1.jpg, screenshot-2.jpg With HDFS-94 being committed, the namenode will use JMX memory beans to get information about heap usage. They provide us with additional information such as NonHeap memory usage and Heap Committed and Initialized memory in addition to Used and Max. It will be useful to see that additional information on the NameNode web ui. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-850) Display more memory details on the web ui
[ https://issues.apache.org/jira/browse/HDFS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Molkov updated HDFS-850: --- Attachment: screenshot-2.jpg Updated view. Display more memory details on the web ui - Key: HDFS-850 URL: https://issues.apache.org/jira/browse/HDFS-850 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov Assignee: Dmytro Molkov Priority: Minor Attachments: HDFS-850.patch, HDFS-850.patch, screenshot-1.jpg, screenshot-2.jpg With HDFS-94 being committed, the namenode will use JMX memory beans to get information about heap usage. They provide us with additional information such as NonHeap memory usage and Heap Committed and Initialized memory in addition to Used and Max. It will be useful to see that additional information on the NameNode web ui. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hairong Kuang updated HDFS-145: --- Status: Open (was: Patch Available) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly --- Key: HDFS-145 URL: https://issues.apache.org/jira/browse/HDFS-145 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.21.0, 0.22.0 Attachments: corruptionDetect.patch, corruptionDetect1.patch, corruptionDetect2.patch, inconsistentLen.patch, inconsistentLen1.patch, inconsistentLen2.patch Currently NameNode treats either the new replica or existing replicas as corrupt if the new replica's length is inconsistent with NN recorded block length. The correct behavior should be 1. For a block that is not under construction, the new replica should be marked as corrupt if its length is inconsistent (no matter shorter or longer) with the NN recorded block length; 2. For an under construction block, if the new replica's length is shorter than the NN recorded block length, the new replica could be marked as corrupt; if the new replica's length is longer, NN should update its recorded block length. But it should not mark existing replicas as corrupt. This is because NN recorded length for an under construction block does not accurately match the block length on datanode disk. NN should not judge an under construction replica to be corrupt by looking at the inaccurate information: its recorded block length. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hairong Kuang updated HDFS-145: --- Status: Patch Available (was: Open) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly --- Key: HDFS-145 URL: https://issues.apache.org/jira/browse/HDFS-145 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.21.0, 0.22.0 Attachments: corruptionDetect.patch, corruptionDetect1.patch, corruptionDetect2.patch, inconsistentLen.patch, inconsistentLen1.patch, inconsistentLen2.patch Currently NameNode treats either the new replica or existing replicas as corrupt if the new replica's length is inconsistent with NN recorded block length. The correct behavior should be 1. For a block that is not under construction, the new replica should be marked as corrupt if its length is inconsistent (no matter shorter or longer) with the NN recorded block length; 2. For an under construction block, if the new replica's length is shorter than the NN recorded block length, the new replica could be marked as corrupt; if the new replica's length is longer, NN should update its recorded block length. But it should not mark existing replicas as corrupt. This is because NN recorded length for an under construction block does not accurately match the block length on datanode disk. NN should not judge an under construction replica to be corrupt by looking at the inaccurate information: its recorded block length. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-699) Primary datanode should compare replicas' on disk lengths
[ https://issues.apache.org/jira/browse/HDFS-699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-699: Hadoop Flags: [Reviewed] +1 patch looks good. Primary datanode should compare replicas' on disk lengths - Key: HDFS-699 URL: https://issues.apache.org/jira/browse/HDFS-699 Project: Hadoop HDFS Issue Type: Bug Components: data-node Reporter: Tsz Wo (Nicholas), SZE Assignee: Hairong Kuang Attachments: HDFS-699.patch According to the design, the primary datanode should compare replicas' on-disk lengths, but it is currently using Block.numBytes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-145) FSNameSystem#addStoredBlock does not handle inconsistent block length correctly
[ https://issues.apache.org/jira/browse/HDFS-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799103#action_12799103 ] Hadoop QA commented on HDFS-145: +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12429954/corruptionDetect2.patch against trunk revision 898134.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/console
This message is automatically generated.
FSNameSystem#addStoredBlock does not handle inconsistent block length correctly --- Key: HDFS-145 URL: https://issues.apache.org/jira/browse/HDFS-145 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.0 Reporter: Hairong Kuang Assignee: Hairong Kuang Fix For: 0.21.0, 0.22.0 Attachments: corruptionDetect.patch, corruptionDetect1.patch, corruptionDetect2.patch, inconsistentLen.patch, inconsistentLen1.patch, inconsistentLen2.patch Currently NameNode treats either the new replica or existing replicas as corrupt if the new replica's length is inconsistent with NN recorded block length. The correct behavior should be 1. For a block that is not under construction, the new replica should be marked as corrupt if its length is inconsistent (no matter shorter or longer) with the NN recorded block length; 2. For an under construction block, if the new replica's length is shorter than the NN recorded block length, the new replica could be marked as corrupt; if the new replica's length is longer, NN should update its recorded block length. But it should not mark existing replicas as corrupt. This is because NN recorded length for an under construction block does not accurately match the block length on datanode disk. NN should not judge an under construction replica to be corrupt by looking at the inaccurate information: its recorded block length. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-873) DataNode directories as URIs
[ https://issues.apache.org/jira/browse/HDFS-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799104#action_12799104 ] Hudson commented on HDFS-873: - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #183 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/]) DataNode directories as URIs Key: HDFS-873 URL: https://issues.apache.org/jira/browse/HDFS-873 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Affects Versions: 0.21.0 Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Fix For: 0.21.0 Attachments: dataDirURI-0-21.patch, dataDirURI.patch, dataDirURI.patch Data-node directories {{dfs.datanode.data.dir}} should be specified as URIs in configurations files making it consistent with how name-node directories are set up since HDFS-396. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-786) Implement getContentSummary(..) in HftpFileSystem
[ https://issues.apache.org/jira/browse/HDFS-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799107#action_12799107 ] Hudson commented on HDFS-786: - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #183 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/]) Implement getContentSummary(..) in HftpFileSystem - Key: HDFS-786 URL: https://issues.apache.org/jira/browse/HDFS-786 Project: Hadoop HDFS Issue Type: Improvement Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.22.0 Attachments: h786_20091223.patch, h786_20091224.patch, h786_20100104.patch, h786_20100106.patch HftpFileSystem does not override getContentSummary(..). As a result, it uses FileSystem's default implementation, which computes content summary on the client side by calling listStatus(..) recursively. In contrast, DistributedFileSystem has overridden getContentSummary(..) and does the computation on the NameNode. As a result, running fs -dus on hftp is much slower than running it on hdfs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
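Why the default path is slow over HFTP, sketched as the generic client-side traversal that FileSystem's fallback performs (one listStatus round trip per directory); overriding getContentSummary(..) moves this loop onto the NameNode:
{code}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class DusSketch {
  // Simplified version of the recursive fallback: every directory
  // costs a round trip from the client, which is what makes
  // 'fs -dus' over hftp so much slower than over hdfs.
  static long totalLength(FileSystem fs, Path p) throws Exception {
    FileStatus st = fs.getFileStatus(p);
    if (!st.isDir()) {
      return st.getLen();
    }
    long sum = 0;
    for (FileStatus child : fs.listStatus(p)) {
      sum += totalLength(fs, child.getPath());
    }
    return sum;
  }
}
{code}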
[jira] Commented: (HDFS-885) Datanode toString() NPEs on null dnRegistration
[ https://issues.apache.org/jira/browse/HDFS-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799105#action_12799105 ] Hudson commented on HDFS-885: - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #183 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/]) Datanode toString() NPEs on null dnRegistration --- Key: HDFS-885 URL: https://issues.apache.org/jira/browse/HDFS-885 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.22.0 Reporter: Steve Loughran Assignee: Steve Loughran Priority: Minor Fix For: 0.22.0 Attachments: HDFS-885.patch {{Datanode.toString()}} assumes the {{dnRegistration}} value is never null. This is not always true, and when it is not the case, the {{toString()}} method NPEs. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
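The minimal shape of such a fix (surrounding class simplified; only the null guard is the point):
{code}
class DataNodeSketch {
  // May legitimately be null before registration completes.
  private Object dnRegistration;

  @Override
  public String toString() {
    // Guard the field instead of assuming registration happened.
    return "DataNode{dnRegistration="
        + (dnRegistration != null ? dnRegistration : "unregistered")
        + "}";
  }
}
{code}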
[jira] Commented: (HDFS-880) TestNNLeaseRecovery fails on windows
[ https://issues.apache.org/jira/browse/HDFS-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799106#action_12799106 ] Hudson commented on HDFS-880: - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #183 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/]) TestNNLeaseRecovery fails on windows Key: HDFS-880 URL: https://issues.apache.org/jira/browse/HDFS-880 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Konstantin Shvachko Assignee: Konstantin Boudnik Fix For: 0.21.0 Attachments: HDFS-880.patch, testNNLeaseRecovery.patch TestNNLeaseRecovery fails on windows trying to delete name-node storage directory. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12799108#action_12799108 ] Hudson commented on HDFS-755: - Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #183 (See [http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/183/]) Read multiple checksum chunks at once in DFSInputStream --- Key: HDFS-755 URL: https://issues.apache.org/jira/browse/HDFS-755 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.22.0 Attachments: alldata-hdfs.tsv, benchmark-8-256.png, benchmark.png, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple checksum chunks in a single call to readChunk. This is the HDFS-side use of that new feature. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.