[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641470#comment-13641470 ] Brandon Li commented on HDFS-4750: -- {quote}This precludes having multiple NFS gateways in operation simultaneously for increased throughput, right?{quote} Not necessarily; it depends on the workloads and the application requirements. Even a regular NFS server mounted by multiple clients can have the same issue. One way to synchronize the client-B-read-after-client-A-write case is to use the NFS lock manager (NLM) protocol (along with the Network Status Monitor (NSM) protocol). In the first phase, it seems a bit of overkill for the use cases we want to support. {quote} Even in a data loading situation, I'd expect a set of several gateway nodes to be used in round-robin in order to increase ingest throughput beyond what a single host can handle. {quote} What I want to point out here, as also noted in the proposal, is that one benefit of NFS support is to make it easier to integrate HDFS into the client's file system namespace. The performance of the NFS gateway is usually slower than using DFSClient directly. Loading files through the NFS gateway can be faster than DFSClient only in a few cases, such as unstable writes with no commit immediately after them. With that said, its performance can be improved in the future in a few ways, such as better caching, pNFS support, etc. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Accessing HDFS is usually done through the HDFS Client or webHDFS. Lack of seamless integration with the client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With the HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able to support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4721) Speed up lease/block recovery when DN fails and a block goes into recovery
[ https://issues.apache.org/jira/browse/HDFS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641509#comment-13641509 ] Varun Sharma commented on HDFS-4721: [~szetszwo] Sorry, I traced this down to a bug in the client. HDFS lease recovery seems to be perfect... Varun Speed up lease/block recovery when DN fails and a block goes into recovery -- Key: HDFS-4721 URL: https://issues.apache.org/jira/browse/HDFS-4721 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.3-alpha Reporter: Varun Sharma Fix For: 2.0.4-alpha Attachments: 4721-hadoop2.patch, 4721-trunk.patch, 4721-trunk-v2.patch, 4721-trunk-v3.patch, 4721-v2.patch, 4721-v3.patch, 4721-v4.patch, 4721-v5.patch, 4721-v6.patch, 4721-v7.patch, 4721-v8.patch This was observed while doing HBase WAL recovery. HBase uses append to write to its write ahead log. So initially the pipeline is set up as DN1 --> DN2 --> DN3. This WAL needs to be read when DN1 fails, since it hosts the HBase regionserver for the WAL. HBase first recovers the lease on the WAL file. During recovery, we choose DN1 as the primary DN to do the recovery even though DN1 has failed and is not heartbeating any more. Avoiding the stale DN1 would speed up recovery and reduce HBase MTTR. There are two options. a) Ride on HDFS-3703: if stale node detection is turned on, we do not choose stale datanodes (typically those that have not heartbeated for 20-30 seconds) as primary DN(s). b) We sort the replicas in order of last heartbeat and always pick the ones which gave the most recent heartbeat. Going to the dead datanode increases lease + block recovery time since the block goes into UNDER_RECOVERY state even though no one is actively recovering it. Please let me know if this makes sense and, if yes, whether we should move forward with a) or b). Thanks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
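To make option (b) above concrete, here is a minimal, self-contained sketch of choosing a recovery primary by most recent heartbeat while skipping stale nodes. The Replica class, its field names, and the 30-second stale interval are illustrative only; the real logic would operate on the NameNode's DatanodeDescriptor structures, and this is not the committed HDFS-4721 patch.

{code:java}
import java.util.Arrays;
import java.util.Comparator;

/** Sketch of option (b): pick the recovery primary by most recent heartbeat. */
public class PrimaryDatanodeChooser {

  /** Simplified stand-in for a replica's datanode; not the real DatanodeDescriptor. */
  static class Replica {
    final String name;          // host:port, for display only
    final long lastHeartbeatMs; // last heartbeat time, epoch millis
    Replica(String name, long lastHeartbeatMs) {
      this.name = name;
      this.lastHeartbeatMs = lastHeartbeatMs;
    }
  }

  /**
   * Returns the replica whose datanode heartbeated most recently, skipping
   * nodes considered stale (no heartbeat within staleIntervalMs); returns
   * null if every replica is stale.
   */
  static Replica choosePrimary(Replica[] replicas, long nowMs, long staleIntervalMs) {
    return Arrays.stream(replicas)
        .filter(r -> nowMs - r.lastHeartbeatMs < staleIntervalMs)
        .max(Comparator.comparingLong(r -> r.lastHeartbeatMs))
        .orElse(null);
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    Replica[] pipeline = {
        new Replica("DN1:50010", now - 120_000), // dead/stale, colocated with the WAL writer
        new Replica("DN2:50010", now - 2_000),
        new Replica("DN3:50010", now - 5_000)
    };
    // With a 30s stale interval, DN1 is skipped and DN2 becomes the primary.
    System.out.println(choosePrimary(pipeline, now, 30_000).name);
  }
}
{code}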
[jira] [Updated] (HDFS-4721) Speed up lease/block recovery when DN fails and a block goes into recovery
[ https://issues.apache.org/jira/browse/HDFS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Sharma updated HDFS-4721: --- Attachment: 4721-trunk-v3.patch Speed up lease/block recovery when DN fails and a block goes into recovery -- Key: HDFS-4721 URL: https://issues.apache.org/jira/browse/HDFS-4721 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.3-alpha Reporter: Varun Sharma Fix For: 2.0.4-alpha Attachments: 4721-hadoop2.patch, 4721-trunk.patch, 4721-trunk-v2.patch, 4721-trunk-v3.patch, 4721-v2.patch, 4721-v3.patch, 4721-v4.patch, 4721-v5.patch, 4721-v6.patch, 4721-v7.patch, 4721-v8.patch This was observed while doing HBase WAL recovery. HBase uses append to write to its write ahead log. So initially the pipeline is setup as DN1 -- DN2 -- DN3 This WAL needs to be read when DN1 fails since it houses the HBase regionserver for the WAL. HBase first recovers the lease on the WAL file. During recovery, we choose DN1 as the primary DN to do the recovery even though DN1 has failed and is not heartbeating any more. Avoiding the stale DN1 would speed up recovery and reduce hbase MTTR. There are two options. a) Ride on HDFS 3703 and if stale node detection is turned on, we do not choose stale datanodes (typically not heart beated for 20-30 seconds) as primary DN(s) b) We sort the replicas in order of last heart beat and always pick the ones which gave the most recent heart beat Going to the dead datanode increases lease + block recovery since the block goes into UNDER_RECOVERY state even though no one is recovering it actively. Please let me know if this makes sense. If yes, whether we should move forward with a) or b). Thanks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4712) New libhdfs method hdfsGetDataNodes
[ https://issues.apache.org/jira/browse/HDFS-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641522#comment-13641522 ] andrea manzi commented on HDFS-4712: Hi Colin, I was checking how to get a DFSClient from the existing FileSystem but I could not find anything in the API. Could you please provide me more info about this? Thanks a lot Andrea New libhdfs method hdfsGetDataNodes --- Key: HDFS-4712 URL: https://issues.apache.org/jira/browse/HDFS-4712 Project: Hadoop HDFS Issue Type: New Feature Components: libhdfs Reporter: andrea manzi we have implemented a possible extension to libhdfs to retrieve information about the available datanodes (there was a mail on the hadoop-hdfs-dev mailing list initially about this: http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201204.mbox/%3CCANhO- s0mvororrxpjnjbql6brkj4c7l+u816xkdc+2r0whj...@mail.gmail.com%3E). I would like to know how to proceed to create a patch, because on the wiki http://wiki.apache.org/hadoop/HowToContribute I can see info about Java patches but nothing related to extensions in C. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
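For reference while the libhdfs extension is being worked out, the equivalent datanode information is reachable from Java without holding a raw DFSClient: DistributedFileSystem exposes getDataNodeStats(). Below is a minimal sketch, assuming an HDFS cluster is configured as the default filesystem and the caller has sufficient privileges (the underlying datanode report typically requires the HDFS superuser); a libhdfs hdfsGetDataNodes would presumably wrap an equivalent call through JNI.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class ListDataNodes {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration(); // picks up core-site.xml / hdfs-site.xml
    FileSystem fs = FileSystem.get(conf);
    if (!(fs instanceof DistributedFileSystem)) {
      System.err.println("Default filesystem is not HDFS");
      return;
    }
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    // One entry per live datanode known to the NameNode.
    for (DatanodeInfo dn : dfs.getDataNodeStats()) {
      System.out.println(dn.getHostName()
          + " capacity=" + dn.getCapacity()
          + " remaining=" + dn.getRemaining());
    }
    fs.close();
  }
}
{code}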
[jira] [Created] (HDFS-4752) TestRBWBlockInvalidation fails on Windows due to file locking
Chris Nauroth created HDFS-4752: --- Summary: TestRBWBlockInvalidation fails on Windows due to file locking Key: HDFS-4752 URL: https://issues.apache.org/jira/browse/HDFS-4752 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth The test attempts to invalidate a block by deleting its block file and meta file. This happens while a datanode thread holds the files open for write. On Windows, this causes a locking conflict, and the test fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4752) TestRBWBlockInvalidation fails on Windows due to file locking
[ https://issues.apache.org/jira/browse/HDFS-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4752: Attachment: HDFS-4752.1.patch I'm attaching a patch that overrides the datanode behavior to open the block and meta files with share delete permission via a JNI call on Windows. This is a bit messy, so I'm curious to get feedback. We've solved similar problems in the past by stopping the daemon that holds the file opened before corrupting the underlying files, but in this case, stopping the daemon would ruin the intent of the test. I also didn't think it was correct in general to use share delete here, so I triggered the logic from a new config flag that is only set to true by this test. I verified that the test passes on Mac and Windows. TestRBWBlockInvalidation fails on Windows due to file locking - Key: HDFS-4752 URL: https://issues.apache.org/jira/browse/HDFS-4752 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-4752.1.patch The test attempts to invalidate a block by deleting its block file and meta file. This happens while a datanode thread holds the files open for write. On Windows, this causes a locking conflict, and the test fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4752) TestRBWBlockInvalidation fails on Windows due to file locking
[ https://issues.apache.org/jira/browse/HDFS-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4752: Status: Patch Available (was: Open) TestRBWBlockInvalidation fails on Windows due to file locking - Key: HDFS-4752 URL: https://issues.apache.org/jira/browse/HDFS-4752 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-4752.1.patch The test attempts to invalidate a block by deleting its block file and meta file. This happens while a datanode thread holds the files open for write. On Windows, this causes a locking conflict, and the test fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4753) GSOC 2013: Develop HLBS(HDFS Based Log-Structured Block Storage System) as a back-end storage driver for VMs like QEMU/XEN, etc.
Harry Wei created HDFS-4753: --- Summary: GSOC 2013: Develop HLBS(HDFS Based Log-Structured Block Storage System) as a back-end storage driver for VMs like QEMU/XEN, etc. Key: HDFS-4753 URL: https://issues.apache.org/jira/browse/HDFS-4753 Project: Hadoop HDFS Issue Type: New Feature Components: libhdfs Affects Versions: HA branch (HDFS-1623) Environment: Currently, Unix/Unix-like/Linux are ok. See http://code.google.com/p/cloudxy/wiki/WHAT_IS_CLOUDXY for details. HLBS is one sub-project of CLoudxy. Reporter: Harry Wei Fix For: HA branch (HDFS-1623) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4753) GSOC 2013: Develop HLBS(HDFS Based Log-Structured Block Storage System) as a back-end storage driver for VMs like QEMU/XEN, etc.
[ https://issues.apache.org/jira/browse/HDFS-4753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harry Wei updated HDFS-4753: Environment: Currently, Unix/Unix-like/Linux are ok. See http://code.google.com/p/cloudxy/wiki/WHAT_IS_CLOUDXY for details. HLBS is one sub-project of CLoudxy. If any developer is interested in this for GSOC 2013, please contact me. My email is harryxi...@gmail.com. (was: Currently, Unix/Unix-like/Linux are ok. See http://code.google.com/p/cloudxy/wiki/WHAT_IS_CLOUDXY for details. HLBS is one sub-project of CLoudxy.) GSOC 2013: Develop HLBS(HDFS Based Log-Structured Block Storage System) as a back-end storage driver for VMs like QEMU/XEN, etc. Key: HDFS-4753 URL: https://issues.apache.org/jira/browse/HDFS-4753 Project: Hadoop HDFS Issue Type: New Feature Components: libhdfs Affects Versions: HA branch (HDFS-1623) Environment: Currently, Unix/Unix-like/Linux are ok. See http://code.google.com/p/cloudxy/wiki/WHAT_IS_CLOUDXY for details. HLBS is one sub-project of CLoudxy. If any developer is interested in this for GSOC 2013, please contact me. My email is harryxi...@gmail.com. Reporter: Harry Wei Fix For: HA branch (HDFS-1623) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4721) Speed up lease/block recovery when DN fails and a block goes into recovery
[ https://issues.apache.org/jira/browse/HDFS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641577#comment-13641577 ] Hadoop QA commented on HDFS-4721: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580494/4721-trunk-v3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4316//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4316//console This message is automatically generated. Speed up lease/block recovery when DN fails and a block goes into recovery -- Key: HDFS-4721 URL: https://issues.apache.org/jira/browse/HDFS-4721 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.3-alpha Reporter: Varun Sharma Fix For: 2.0.4-alpha Attachments: 4721-hadoop2.patch, 4721-trunk.patch, 4721-trunk-v2.patch, 4721-trunk-v3.patch, 4721-v2.patch, 4721-v3.patch, 4721-v4.patch, 4721-v5.patch, 4721-v6.patch, 4721-v7.patch, 4721-v8.patch This was observed while doing HBase WAL recovery. HBase uses append to write to its write ahead log. So initially the pipeline is setup as DN1 -- DN2 -- DN3 This WAL needs to be read when DN1 fails since it houses the HBase regionserver for the WAL. HBase first recovers the lease on the WAL file. During recovery, we choose DN1 as the primary DN to do the recovery even though DN1 has failed and is not heartbeating any more. Avoiding the stale DN1 would speed up recovery and reduce hbase MTTR. There are two options. a) Ride on HDFS 3703 and if stale node detection is turned on, we do not choose stale datanodes (typically not heart beated for 20-30 seconds) as primary DN(s) b) We sort the replicas in order of last heart beat and always pick the ones which gave the most recent heart beat Going to the dead datanode increases lease + block recovery since the block goes into UNDER_RECOVERY state even though no one is recovering it actively. Please let me know if this makes sense. If yes, whether we should move forward with a) or b). Thanks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4752) TestRBWBlockInvalidation fails on Windows due to file locking
[ https://issues.apache.org/jira/browse/HDFS-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641608#comment-13641608 ] Hadoop QA commented on HDFS-4752: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580499/HDFS-4752.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4317//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4317//console This message is automatically generated. TestRBWBlockInvalidation fails on Windows due to file locking - Key: HDFS-4752 URL: https://issues.apache.org/jira/browse/HDFS-4752 Project: Hadoop HDFS Issue Type: Bug Components: datanode, test Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-4752.1.patch The test attempts to invalidate a block by deleting its block file and meta file. This happens while a datanode thread holds the files open for write. On Windows, this causes a locking conflict, and the test fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4739) NN can miscalculate the number of extra edit log segments to retain
[ https://issues.apache.org/jira/browse/HDFS-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641656#comment-13641656 ] Hudson commented on HDFS-4739: -- Integrated in Hadoop-Yarn-trunk #194 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/194/]) HDFS-4739. NN can miscalculate the number of extra edit log segments to retain. Contributed by Aaron T. Myers. (Revision 1471769) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1471769 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorageRetentionManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java NN can miscalculate the number of extra edit log segments to retain --- Key: HDFS-4739 URL: https://issues.apache.org/jira/browse/HDFS-4739 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.4-alpha Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 2.0.5-beta Attachments: HDFS-4739.patch, HDFS-4739.patch The code in NNStorageRetentionManager#purgeOldStorage is intended to place a cap on the number of _extra_ edit log segments retained beyond what is strictly required to replay the FS history since the last fsimage. In fact this code currently places a limit on the _total_ number of extra edit log segments. If the number of required segments is greater than the configured cap, there will be no data loss, but an ugly error will be thrown and the NN will fail to start. The fix is simple, and in the meantime a work-around is just to raise the value of dfs.namenode.max.extra.edits.segments.retained and start the NN. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4745) TestDataTransferKeepalive#testSlowReader has race condition that causes sporadic failure
[ https://issues.apache.org/jira/browse/HDFS-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641657#comment-13641657 ] Hudson commented on HDFS-4745: -- Integrated in Hadoop-Yarn-trunk #194 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/194/]) HDFS-4745. TestDataTransferKeepalive#testSlowReader has race condition that causes sporadic failure. Contributed by Chris Nauroth. (Revision 1475623) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1475623 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferKeepalive.java TestDataTransferKeepalive#testSlowReader has race condition that causes sporadic failure Key: HDFS-4745 URL: https://issues.apache.org/jira/browse/HDFS-4745 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.0.5-beta Attachments: HDFS-4745.1.patch The test asserts that the xceiver thread has stopped after timeout due to a slow reader, but the test's sleep time is too short, and the xceiver is often still running. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4339) Persist inode id in fsimage and editlog
[ https://issues.apache.org/jira/browse/HDFS-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641658#comment-13641658 ] Hudson commented on HDFS-4339: -- Integrated in Hadoop-Yarn-trunk #194 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/194/]) HDFS-4339. Add the missed entry CHANGES.txt from r1465835 (Revision 1471595) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1471595 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Persist inode id in fsimage and editlog --- Key: HDFS-4339 URL: https://issues.apache.org/jira/browse/HDFS-4339 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Fix For: 3.0.0 Attachments: editsStored, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch Persist inode id in fsimage and editlog and update offline viewers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4536) Add test methods in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
[ https://issues.apache.org/jira/browse/HDFS-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dennis Y updated HDFS-4536: --- Attachment: HDFS-4536-trunk--N7.patch HDFS-4536-branch-2--N7.patch resolved merge conflicts Add test methods in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler -- Key: HDFS-4536 URL: https://issues.apache.org/jira/browse/HDFS-4536 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Attachments: HADOOP-4536-branch-2-a.patch, HADOOP-4536-branch-2c.patch, HADOOP-4536-trunk-a.patch, HADOOP-4536-trunk-c.patch, HDFS-4536-branch-2--N7.patch, HDFS-4536-trunk--N6.patch, HDFS-4536-trunk--N7.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4339) Persist inode id in fsimage and editlog
[ https://issues.apache.org/jira/browse/HDFS-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641756#comment-13641756 ] Hudson commented on HDFS-4339: -- Integrated in Hadoop-Hdfs-trunk #1383 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1383/]) HDFS-4339. Add the missed entry CHANGES.txt from r1465835 (Revision 1471595) Result = FAILURE suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1471595 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Persist inode id in fsimage and editlog --- Key: HDFS-4339 URL: https://issues.apache.org/jira/browse/HDFS-4339 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Fix For: 3.0.0 Attachments: editsStored, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch Persist inode id in fsimage and editlog and update offline viewers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4686) Fix quota computation for rename with snapshots
[ https://issues.apache.org/jira/browse/HDFS-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641759#comment-13641759 ] Hudson commented on HDFS-4686: -- Integrated in Hadoop-Hdfs-Snapshots-Branch-build #168 (See [https://builds.apache.org/job/Hadoop-Hdfs-Snapshots-Branch-build/168/]) HDFS-4686. Update quota computation for rename and INodeReference. Contributed by Jing Zhao (Revision 1471647) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1471647 Files : * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageSerialization.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INode.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectory.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectoryWithQuota.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeReference.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeSymlink.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/AbstractINodeDiffList.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectoryWithSnapshot.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/Snapshot.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotFSImageFormat.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestRenameWithSnapshots.java Fix quota computation for rename with snapshots --- Key: HDFS-4686 URL: https://issues.apache.org/jira/browse/HDFS-4686 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Fix For: Snapshot (HDFS-2802) Attachments: HDFS-4686.001.patch, HDFS-4686.001.patch, HDFS-4686.002.patch, HDFS-4686.003.patch Currently after a rename operation within/from a snapshottable directory, a reference node is created in both src and dst subtree, pointing to the original renamed inode. 
With this change the original quota computation may count the quota usage of the renamed subtree multiple times. This jira tries to fix this issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4729) Fix OfflineImageViewer and permission checking for snapshot operations
[ https://issues.apache.org/jira/browse/HDFS-4729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641760#comment-13641760 ] Hudson commented on HDFS-4729: -- Integrated in Hadoop-Hdfs-Snapshots-Branch-build #168 (See [https://builds.apache.org/job/Hadoop-Hdfs-Snapshots-Branch-build/168/]) HDFS-4729. Fix OfflineImageViewer and permission checking for snapshot operations. Contributed by Jing Zhao (Revision 1471665) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1471665 Files : * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-2802.txt * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageLoaderCurrent.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/ImageVisitor.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshottableDirListing.java * /hadoop/common/branches/HDFS-2802/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/offlineImageViewer/TestOfflineImageViewer.java Fix OfflineImageViewer and permission checking for snapshot operations -- Key: HDFS-4729 URL: https://issues.apache.org/jira/browse/HDFS-4729 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Fix For: Snapshot (HDFS-2802) Attachments: HDFS-4729.001.patch, HDFS-4729.002.patch The format of FSImage is updated after supporting rename with snapshots. We need to update OfflineImageViewer accordingly. Also, some permission checks in FSNamesystem are incorrect. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4536) Add test methods in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
[ https://issues.apache.org/jira/browse/HDFS-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641761#comment-13641761 ] Hadoop QA commented on HDFS-4536: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580529/HDFS-4536-trunk--N7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4318//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4318//console This message is automatically generated. Add test methods in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler -- Key: HDFS-4536 URL: https://issues.apache.org/jira/browse/HDFS-4536 Project: Hadoop HDFS Issue Type: Test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Vadim Bondarev Attachments: HADOOP-4536-branch-2-a.patch, HADOOP-4536-branch-2c.patch, HADOOP-4536-trunk-a.patch, HADOOP-4536-trunk-c.patch, HDFS-4536-branch-2--N7.patch, HDFS-4536-trunk--N6.patch, HDFS-4536-trunk--N7.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4754) Add an API in the namenode to mark a datanode as stale
Nicolas Liochon created HDFS-4754: - Summary: Add an API in the namenode to mark a datanode as stale Key: HDFS-4754 URL: https://issues.apache.org/jira/browse/HDFS-4754 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client, namenode Reporter: Nicolas Liochon Priority: Critical There has been detection of stale datanodes in HDFS since HDFS-3703, with a timeout defaulted to 30s. There are two reasons to add an API to mark a node as stale even if the timeout is not yet reached: 1) ZooKeeper can detect that a client is dead at any moment. So, for HBase, we sometimes start the recovery before a node is marked stale (even with reasonable settings such as stale: 20s; HBase ZK timeout: 30s). 2) Some third parties could detect that a node is dead before the timeout, hence saving us the cost of retrying. An example of such hardware is Arista, presented here by [~tsuna] http://tsunanet.net/~tsuna/fsf-hbase-meetup-april13.pdf, and confirmed in HBASE-6290. As usual, even if the node is dead it can come back before the 10 minute limit. So I would propose to set a time bound. The API would be namenode.markStale(String ipAddress, int port, long durationInMs); After durationInMs, the namenode would again rely only on its heartbeats to decide. Thoughts? If there are no objections, and if nobody on the hdfs dev team has the time to spend on it, I will give it a try for branch 2 and 3. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
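A minimal sketch of what the proposed call could look like; MarkStaleProtocol, MarkStaleSketch and everything in them are hypothetical, derived only from the signature suggested above, and the real RPC plumbing, authorization, and NameNode-side bookkeeping are left out.

{code:java}
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical client-facing protocol for the proposal above; not an existing
 * Hadoop interface.  A caller (e.g. the HBase master, or tooling driven by
 * switch-level failure detection) asks the NameNode to treat a datanode as
 * stale for a bounded period, after which heartbeat-based detection applies again.
 */
interface MarkStaleProtocol {
  void markStale(String ipAddress, int port, long durationInMs) throws IOException;
}

/** Toy in-memory stand-in for the NameNode side, showing the intended semantics only. */
public class MarkStaleSketch implements MarkStaleProtocol {
  private final ConcurrentHashMap<String, Long> staleUntil = new ConcurrentHashMap<>();

  @Override
  public void markStale(String ipAddress, int port, long durationInMs) {
    staleUntil.put(ipAddress + ":" + port, System.currentTimeMillis() + durationInMs);
  }

  /** True while the explicit mark is in force; afterwards fall back to heartbeats. */
  public boolean isExplicitlyStale(String ipAddress, int port) {
    Long until = staleUntil.get(ipAddress + ":" + port);
    return until != null && System.currentTimeMillis() < until;
  }

  public static void main(String[] args) throws IOException {
    MarkStaleSketch nn = new MarkStaleSketch();
    nn.markStale("10.0.0.12", 50010, 10 * 60 * 1000L); // stale for at most 10 minutes
    System.out.println(nn.isExplicitlyStale("10.0.0.12", 50010)); // true
  }
}
{code}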
[jira] [Commented] (HDFS-4739) NN can miscalculate the number of extra edit log segments to retain
[ https://issues.apache.org/jira/browse/HDFS-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641830#comment-13641830 ] Hudson commented on HDFS-4739: -- Integrated in Hadoop-Mapreduce-trunk #1410 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1410/]) HDFS-4739. NN can miscalculate the number of extra edit log segments to retain. Contributed by Aaron T. Myers. (Revision 1471769) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1471769 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorageRetentionManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNNStorageRetentionManager.java NN can miscalculate the number of extra edit log segments to retain --- Key: HDFS-4739 URL: https://issues.apache.org/jira/browse/HDFS-4739 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.4-alpha Reporter: Aaron T. Myers Assignee: Aaron T. Myers Fix For: 2.0.5-beta Attachments: HDFS-4739.patch, HDFS-4739.patch The code in NNStorageRetentionManager#purgeOldStorage is intended to place a cap on the number of _extra_ edit log segments retained beyond what is strictly required to replay the FS history since the last fsimage. In fact this code currently places a limit on the _total_ number of extra edit log segments. If the number of required segments is greater than the configured cap, there will be no data loss, but an ugly error will be thrown and the NN will fail to start. The fix is simple, and in the meantime a work-around is just to raise the value of dfs.namenode.max.extra.edits.segments.retained and start the NN. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4745) TestDataTransferKeepalive#testSlowReader has race condition that causes sporadic failure
[ https://issues.apache.org/jira/browse/HDFS-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641831#comment-13641831 ] Hudson commented on HDFS-4745: -- Integrated in Hadoop-Mapreduce-trunk #1410 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1410/]) HDFS-4745. TestDataTransferKeepalive#testSlowReader has race condition that causes sporadic failure. Contributed by Chris Nauroth. (Revision 1475623) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1475623 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferKeepalive.java TestDataTransferKeepalive#testSlowReader has race condition that causes sporadic failure Key: HDFS-4745 URL: https://issues.apache.org/jira/browse/HDFS-4745 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 2.0.5-beta Attachments: HDFS-4745.1.patch The test asserts that the xceiver thread has stopped after timeout due to a slow reader, but the test's sleep time is too short, and the xceiver is often still running. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4339) Persist inode id in fsimage and editlog
[ https://issues.apache.org/jira/browse/HDFS-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641832#comment-13641832 ] Hudson commented on HDFS-4339: -- Integrated in Hadoop-Mapreduce-trunk #1410 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1410/]) HDFS-4339. Add the missed entry CHANGES.txt from r1465835 (Revision 1471595) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1471595 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Persist inode id in fsimage and editlog --- Key: HDFS-4339 URL: https://issues.apache.org/jira/browse/HDFS-4339 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Fix For: 3.0.0 Attachments: editsStored, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch, HDFS-4339.patch Persist inode id in fsimage and editlog and update offline viewers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3934) duplicative dfs_hosts entries handled wrong
[ https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641848#comment-13641848 ] Daryn Sharp commented on HDFS-3934: --- Quick review just by eyeballing the patch: It'd be nice to continue to use the {{HostsFileReader}} and post-process the result. Otherwise it's a consistency/maintenance burden to copy-n-paste any new parsing functionality. Why does the reader need to instantiate a dummy {{DatanodeID}}? It appears to be for repeatedly making the somewhat fragile assumption that xferAddr is ipAddr+port? If that relationship changes, we've got a problem... Patch appears to have dropped support for the node's registration name. [~eli] wanted me to maintain that feature in HDFS-3990. If we need to keep it, doing a lookup and a canonical lookup (which can trigger another dns lookup) isn't compatible with supporting the reg name. Doing a lookup followed by {{getCanonicalName}} is a bad idea. It does 2 more lookups: hostname -> PTR -> A, so it can resolve CNAMEs to IP to hostname. With this change I think it will cause 3 lookups per host. Question about // If no transfer port was specified, we take a guess. Why is it needed, and what are the ramifications of getting this wrong? Just a display issue? It _feels_ like de-dupping the display should be a bit easier to do w/o changing core node registration logic? duplicative dfs_hosts entries handled wrong --- Key: HDFS-3934 URL: https://issues.apache.org/jira/browse/HDFS-3934 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.1-alpha Reporter: Andy Isaacson Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, HDFS-3934.003.patch, HDFS-3934.004.patch, HDFS-3934.005.patch, HDFS-3934.006.patch, HDFS-3934.007.patch, HDFS-3934.008.patch A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by hostname ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} after the NN restarts because {{getDatanodeListForReport}} does not handle such a pseudo-duplicate correctly: # the "Remove any nodes we know about from the map" loop no longer has the knowledge to remove the spurious entries # the "The remaining nodes are ones that are referenced by the hosts files" loop does not do hostname lookups, so does not know that the IP and hostname refer to the same host. Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in the JSP output: The *Node* column shows :50010 as the nodename, with HTML markup {{a href=http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&dir=%2F&nnaddr=172.29.97.196:8020; title=172.29.97.216:50010:50010/a}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
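To illustrate the lookup cost being discussed, a small self-contained sketch of why following a forward lookup with getCanonicalHostName() multiplies DNS round trips; the hostname is a placeholder and the exact number of queries depends on resolver caching and configuration.

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class LookupCostDemo {
  public static void main(String[] args) throws UnknownHostException {
    String hostsFileEntry = "dn1.example.com"; // placeholder dfs_hosts entry

    // Lookup #1: forward (A) resolution of the configured name.
    InetAddress addr = InetAddress.getByName(hostsFileEntry);

    // getCanonicalHostName() typically adds a reverse (PTR) lookup plus a
    // verifying forward (A) lookup, which is the extra cost noted above.
    String canonical = addr.getCanonicalHostName();

    System.out.println(hostsFileEntry + " resolves to " + addr.getHostAddress()
        + ", canonical name " + canonical);
  }
}
{code}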
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641855#comment-13641855 ] Brock Noland commented on HDFS-4750: Hi Brandon, Thank you for the quick response! bq. 10 milliseconds is the time from the reference paper. In the initial implementation, we used 10 seconds just to be on the safe side. What happens if the 10 seconds expires and the prerequisite write has not been received? The biggest issue I had when moving the proxy from basically working to handling multiple heavy-use clients was memory consumption while waiting for prerequisite writes. I eventually had to write pending writes to a file. This is documented in this issue: https://github.com/brockn/hdfs-nfs-proxy/issues/7 bq. Regarding small file append, it starts from the correct offset in the tests I observed. For example, I tried echo abcd >> /mnt_test/file_with_5bytes, the write request starts with offset 5. With the initial file loading tests with Linux/Mac clients, so far we haven't encountered the problem you mentioned. Interesting, what version of linux have you tried? I believe I was using RHEL 5.X. bq. For the second question, as long as the second user uses NFS gateway to read the closed file, the second user should be able to get the data buffered in NFS gateway. For the opened files, NFS gateway also saves their latest file size. When it serves getattr request, it gets file attributes from HDFS and then update the file length based on the cached length. Cool, my question was more around how we are going to make our users aware of this limitation. I could imagine many users believing that once they have closed a file via NFS that exact file will be available via one of the other APIs. We'll need to make this limitation blatantly obvious to users, otherwise it will likely become a support headache. Additionally, is there anything the user can do to force the writes? i.e. If the user has control over the program, could they do an fsync(fd) to force the flush? Cheers, Brock Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Accessing HDFS is usually done through the HDFS Client or webHDFS. Lack of seamless integration with the client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With the HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able to support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641864#comment-13641864 ] Daryn Sharp commented on HDFS-4750: --- This part seems a bit worrisome: bq. The solution is to close the stream after it’s idle(no write) for a certain period(e.g., 10 seconds). The subsequent write will become append and open the stream again. This is very semantically wrong. If another client appended to the file in the interim, the file position _should not_ implicitly move to the end of the file. Assuming the proposed approach is otherwise valid: when the client attempts to write again via append, it should throw an exception if the file size is greater than the client's current position in the stream. Even that breaks POSIX semantics, but it's less wrong by not causing the potential for garbled data. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
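A minimal sketch of the guard suggested above: before reopening an idle-closed stream via append, compare the position the client last wrote to with the file's current length, and fail loudly instead of silently continuing from the new EOF. The class and method names are hypothetical, not part of the actual NFS gateway code.

{code:java}
import java.io.IOException;

/** Hypothetical gateway-side check; not the actual NFS gateway implementation. */
public class AppendPositionGuard {

  /**
   * Throws if the file grew past the position this client last wrote to,
   * i.e. another writer appended while the stream was closed.  Reopening
   * with append would otherwise silently move this client's writes to the
   * new end of file.
   */
  static void checkBeforeReopen(long clientStreamPosition, long currentFileLength)
      throws IOException {
    if (currentFileLength > clientStreamPosition) {
      throw new IOException("File length " + currentFileLength
          + " is beyond this client's stream position " + clientStreamPosition
          + "; refusing to reopen for append");
    }
  }

  public static void main(String[] args) {
    try {
      checkBeforeReopen(1024, 1024); // ok: nobody else appended
      System.out.println("reopen allowed");
      checkBeforeReopen(1024, 4096); // another client appended in the interim
    } catch (IOException e) {
      System.out.println("reopen refused: " + e.getMessage());
    }
  }
}
{code}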
[jira] [Commented] (HDFS-4721) Speed up lease/block recovery when DN fails and a block goes into recovery
[ https://issues.apache.org/jira/browse/HDFS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641884#comment-13641884 ] Varun Sharma commented on HDFS-4721: [~szetszwo] The tests are passing with the latest patch. Should we modify the description for stale node interval and suggest that we use it for block recovery as well ? Speed up lease/block recovery when DN fails and a block goes into recovery -- Key: HDFS-4721 URL: https://issues.apache.org/jira/browse/HDFS-4721 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.3-alpha Reporter: Varun Sharma Fix For: 2.0.4-alpha Attachments: 4721-hadoop2.patch, 4721-trunk.patch, 4721-trunk-v2.patch, 4721-trunk-v3.patch, 4721-v2.patch, 4721-v3.patch, 4721-v4.patch, 4721-v5.patch, 4721-v6.patch, 4721-v7.patch, 4721-v8.patch This was observed while doing HBase WAL recovery. HBase uses append to write to its write ahead log. So initially the pipeline is setup as DN1 -- DN2 -- DN3 This WAL needs to be read when DN1 fails since it houses the HBase regionserver for the WAL. HBase first recovers the lease on the WAL file. During recovery, we choose DN1 as the primary DN to do the recovery even though DN1 has failed and is not heartbeating any more. Avoiding the stale DN1 would speed up recovery and reduce hbase MTTR. There are two options. a) Ride on HDFS 3703 and if stale node detection is turned on, we do not choose stale datanodes (typically not heart beated for 20-30 seconds) as primary DN(s) b) We sort the replicas in order of last heart beat and always pick the ones which gave the most recent heart beat Going to the dead datanode increases lease + block recovery since the block goes into UNDER_RECOVERY state even though no one is recovering it actively. Please let me know if this makes sense. If yes, whether we should move forward with a) or b). Thanks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641896#comment-13641896 ] Allen Wittenauer commented on HDFS-4750: What are the plans around RPCSEC and GSSAPI mapping capabilities? While I recognize that these are optional to the NFSv3 specs, a lot of folks need them in order to use this feature. It is probably also worth pointing out that NFSv4 and higher fix this mistake and require the security pieces to be there in order to be RFC compliant. So if we want to implement pNFS, we have to do this work anyway. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641900#comment-13641900 ] Brock Noland commented on HDFS-4750: I didn't spend too much time looking at NFSv3 security but FWIW the NFS4 proxy supports Kerberos in privacy mode. This code might be of use. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4053) Increase the default block size
[ https://issues.apache.org/jira/browse/HDFS-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-4053: -- Fix Version/s: (was: 3.0.0) 2.0.5-beta Release Note: The default block size prior to this change was 64MB. This jira changes the default block size to 128MB. To go back to the previous behavior, please configure, in hdfs-site.xml, the configuration parameter dfs.blocksize to 67108864. I merged this change to branch-2 to be picked up for 2.0.5-beta. I have also updated the release notes. Increase the default block size --- Key: HDFS-4053 URL: https://issues.apache.org/jira/browse/HDFS-4053 Project: Hadoop HDFS Issue Type: Improvement Reporter: Eli Collins Assignee: Eli Collins Fix For: 2.0.5-beta Attachments: hdfs-4053.txt, hdfs-4053.txt, hdfs-4053.txt The default HDFS block size ({{dfs.blocksize}}) has been 64mb forever. 128mb works well in practice on today's hardware configurations; most clusters I work with use it or higher (e.g. 256mb). Let's bump to 128mb in trunk for v3. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
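Because dfs.blocksize is read on the client side when a file is created, the override can also be applied per job or per client instead of cluster-wide. A hedged sketch of that (the path and data written are illustrative; the same effect comes from setting dfs.blocksize to 67108864 in hdfs-site.xml as the release note says):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OldBlockSizeWriter {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Revert to the pre-change default of 64MB for files created by this client.
    conf.setLong("dfs.blocksize", 64L * 1024 * 1024);
    FileSystem fs = FileSystem.get(conf);
    try (FSDataOutputStream out = fs.create(new Path("/tmp/sixty-four-mb-blocks.dat"))) {
      out.writeBytes("written with 64MB blocks\n");
    }
    fs.close();
  }
}
{code}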
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641973#comment-13641973 ] Brandon Li commented on HDFS-4750: -- {quote}What are the plans around RPCSEC and GSSAPI mapping capabilities?{quote} The initial implementation doesn't have it but I agree it would be nice to support it sooner than later. {quote}...This code might be of use.{quote} Sounds like a plan :-) Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4300) TransferFsImage.downloadEditsToStorage should use a tmp file for destination
[ https://issues.apache.org/jira/browse/HDFS-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13641986#comment-13641986 ] Colin Patrick McCabe commented on HDFS-4300: I could be off base here, but I think there is some value in giving the temporary files unique names. Failed storage directories can come back (at least on the primary NN; haven't checked if this applies to the 2NN) if the correct configuration is set. Sure, everything should work correctly, but why take that chance when you can just use a unique file name? TransferFsImage.downloadEditsToStorage should use a tmp file for destination Key: HDFS-4300 URL: https://issues.apache.org/jira/browse/HDFS-4300 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.2-alpha Reporter: Todd Lipcon Assignee: Andrew Wang Priority: Critical Attachments: hdfs-4300-1.patch Currently, in TransferFsImage.downloadEditsToStorage, we download the edits file directly to its finalized path. So, if the transfer fails in the middle, a half-written file is left and cannot be distinguished from a correct file. So, future checkpoints by the 2NN will fail, since the file is truncated in the middle -- but it won't ever download a good copy because it thinks it already has the proper file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
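Following up on the HDFS-4300 comment above, here is a rough sketch of the pattern being suggested (download to a uniquely named temporary file, then rename into place only when complete). The Downloader callback and class names are hypothetical; this is not the actual TransferFsImage code.
{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.*;

public class AtomicDownload {
  interface Downloader { void download(OutputStream out) throws IOException; }

  static void downloadTo(Path finalPath, Downloader d) throws IOException {
    // A unique temp name avoids collisions if a previously failed storage dir comes back.
    Path tmp = Files.createTempFile(finalPath.getParent(), finalPath.getFileName() + ".", ".tmp");
    try (OutputStream out = Files.newOutputStream(tmp)) {
      d.download(out);                       // may fail half-way; only the tmp file is dirty
    } catch (IOException e) {
      Files.deleteIfExists(tmp);
      throw e;
    }
    Files.move(tmp, finalPath, StandardCopyOption.ATOMIC_MOVE);  // publish only when complete
  }
}
{code}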
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642000#comment-13642000 ] Brandon Li commented on HDFS-4750: -- {quote}This is very semantically wrong. If another client appended to the file in the interim, the file position should not implicitly move to the end of the file. {quote} When the stream is closed, the file size is updated in HDFS. Before it's closed, the same client still holds the lease. {quote}Assuming the proposed approach is otherwise valid: when the client attempts to write again via append, it should throw an exception if the file size is greater than the client's current position in the stream. Even that breaks POSIX semantics, but it's less wrong by not causing the potential for garbled data.{quote} If the file is appended by another client, the first client's new write before the file's EOF becomes a random write and would fail with an exception. What breaks POSIX semantics here is that random writes are not supported. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
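To make the HDFS-4750 append discussion above concrete, here is a hedged sketch of the described check using only the public FileSystem API; the gateway-side tracking of the client's offset is assumed for illustration and is not taken from the patch.
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendPositionCheck {
  static FSDataOutputStream resumeAppend(FileSystem fs, Path file, long clientOffset)
      throws IOException {
    long currentLen = fs.getFileStatus(file).getLen();
    if (currentLen > clientOffset) {
      // Another client appended in the interim; writing at clientOffset would be a random write.
      throw new IOException("File " + file + " grew to " + currentLen
          + " beyond client offset " + clientOffset + "; random writes are not supported");
    }
    return fs.append(file);                  // safe: we are still writing at EOF
  }
}
{code}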
[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode
[ https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642015#comment-13642015 ] Konstantin Shvachko commented on HDFS-4434: --- On what basis this was merged to branch 2.0.5? Provide a mapping from INodeId to INode --- Key: HDFS-4434 URL: https://issues.apache.org/jira/browse/HDFS-4434 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Suresh Srinivas Fix For: 3.0.0, 2.0.5-beta Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch This JIRA is to provide a way to access the INode via its id. The proposed solution is to have an in-memory mapping from INodeId to INode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4712) New libhdfs method hdfsGetDataNodes
[ https://issues.apache.org/jira/browse/HDFS-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642043#comment-13642043 ] Colin Patrick McCabe commented on HDFS-4712: Hi Andrea, I recommend casting the {{FileSystem}} to {{DistributedFileSystem}} and then invoking {{DistributedFileSystem#getDataNodeStats}}. No need for a DFSClient (there is a way to get that, but it's discouraged). New libhdfs method hdfsGetDataNodes --- Key: HDFS-4712 URL: https://issues.apache.org/jira/browse/HDFS-4712 Project: Hadoop HDFS Issue Type: New Feature Components: libhdfs Reporter: andrea manzi We have implemented a possible extension to libhdfs to retrieve information about the available datanodes (there was a mail on the hadoop-hdfs-dev mailing list initially about this: http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201204.mbox/%3CCANhO- s0mvororrxpjnjbql6brkj4c7l+u816xkdc+2r0whj...@mail.gmail.com%3E). I would like to know how to proceed to create a patch, because on the wiki http://wiki.apache.org/hadoop/HowToContribute I can see info about Java patches but nothing related to extensions in C. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
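A minimal sketch of the suggestion in the HDFS-4712 comment above, written in plain Java against the public HDFS API; the class name and the default configuration loading are illustrative only.
{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class ListDataNodes {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    if (!(fs instanceof DistributedFileSystem)) {
      throw new IOException("Not an HDFS file system: " + fs.getUri());
    }
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    for (DatanodeInfo dn : dfs.getDataNodeStats()) {     // datanodes reported by the NameNode
      System.out.println(dn.getHostName() + " capacity=" + dn.getCapacity()
          + " remaining=" + dn.getRemaining());
    }
  }
}
{code}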
[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode
[ https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642036#comment-13642036 ] Suresh Srinivas commented on HDFS-4434: --- bq. On what basis this was merged to branch 2.0.5? See - https://issues.apache.org/jira/secure/EditComment!default.jspa?id=12631775commentId=13640438 What is the concern? Provide a mapping from INodeId to INode --- Key: HDFS-4434 URL: https://issues.apache.org/jira/browse/HDFS-4434 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Suresh Srinivas Fix For: 3.0.0, 2.0.5-beta Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch This JIRA is to provide a way to access the INode via its id. The proposed solution is to have an in-memory mapping from INodeId to INode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4721) Speed up lease/block recovery when DN fails and a block goes into recovery
[ https://issues.apache.org/jira/browse/HDFS-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642050#comment-13642050 ] Tsz Wo (Nicholas), SZE commented on HDFS-4721: -- That's great. Please feel free to modify the description. Speed up lease/block recovery when DN fails and a block goes into recovery -- Key: HDFS-4721 URL: https://issues.apache.org/jira/browse/HDFS-4721 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.3-alpha Reporter: Varun Sharma Fix For: 2.0.4-alpha Attachments: 4721-hadoop2.patch, 4721-trunk.patch, 4721-trunk-v2.patch, 4721-trunk-v3.patch, 4721-v2.patch, 4721-v3.patch, 4721-v4.patch, 4721-v5.patch, 4721-v6.patch, 4721-v7.patch, 4721-v8.patch This was observed while doing HBase WAL recovery. HBase uses append to write to its write ahead log. So initially the pipeline is setup as DN1 -- DN2 -- DN3 This WAL needs to be read when DN1 fails since it houses the HBase regionserver for the WAL. HBase first recovers the lease on the WAL file. During recovery, we choose DN1 as the primary DN to do the recovery even though DN1 has failed and is not heartbeating any more. Avoiding the stale DN1 would speed up recovery and reduce hbase MTTR. There are two options. a) Ride on HDFS 3703 and if stale node detection is turned on, we do not choose stale datanodes (typically not heart beated for 20-30 seconds) as primary DN(s) b) We sort the replicas in order of last heart beat and always pick the ones which gave the most recent heart beat Going to the dead datanode increases lease + block recovery since the block goes into UNDER_RECOVERY state even though no one is recovering it actively. Please let me know if this makes sense. If yes, whether we should move forward with a) or b). Thanks -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642052#comment-13642052 ] Daryn Sharp commented on HDFS-4750: --- bq. If the file is appended by another client, the first client's new write before the file's EOF becomes a random write and would fail with an exception. What breaks POSIX semantics here is that random writes are not supported. Ok, we're in 100% agreement. The doc is just ambiguous. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4749) Use INodeId to identify the corresponding directory node for FSImage saving/loading
[ https://issues.apache.org/jira/browse/HDFS-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4749: - Component/s: (was: datanode) Hadoop Flags: Reviewed +1 patch looks good. This cleans up a lot of code. Use INodeId to identify the corresponding directory node for FSImage saving/loading --- Key: HDFS-4749 URL: https://issues.apache.org/jira/browse/HDFS-4749 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-4749.000.patch Currently in fsimage, we use the path to locate a directory node for later loading, i.e., when loading a subtree from fsimage, we first read the path of the directory node, and resolve the path to identify the directory node. This brings extra complexity since we need to generate path for directory nodes in both the current tree and snapshot copies. As a simplification, we can use INodeId to identify the directory node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642065#comment-13642065 ] Brandon Li commented on HDFS-4750: -- @Brock {quote} What happens if the 10 seconds expires and the prerequisite write has not been received? The biggest issue I had when moving the proxy from basically working to handling multiple heavy use clients was memory consumption while waiting for pre-requisite writes. I eventually had to write pending writes to a file. This is documented in this issue https://github.com/brockn/hdfs-nfs-proxy/issues/7 {quote} The pending write requests will fail after the timeout. Saving pending writes in files can help in some cases but also introduces some problems. First, it doesn't eliminate the problem. The prerequisite write may never arrive if 10 seconds is not long enough. Even if the prerequisite write finally arrives, the accumulated writes in the file may have timed out. Secondly, it makes the server stateful (or hold more state information). To support HA later, we would have to move the state information from one NFS gateway to another in order to recover. If the state recovery takes too long to finish, it can cause the clients' new requests to fail. More testing and research work is needed here. {quote} Interesting, what version of linux have you tried? I believe I was using RHEL 5.X. {quote} CentOS 6.3 and Mac 10.7.5 {quote} Additionally, is there anything the user can do to force the writes? i.e. If the user has control over the program, could they do a fsync(fd) to force the flush? {quote} fsync could trigger an NFS commit, which will sync and persist the data. {quote} I could imagine many users believing once they have closed a file via NFS that exact file will be available via one of the other APIs. We'll need to make this limitation blatantly obvious to users otherwise it will likely become a support headache.{quote} If the application expects that, after closing a file through one NFS gateway, the new data is immediately available to all other NFS gateways, the application should do a sync call after close. This is not a limitation of this NFS implementation alone. POSIX close doesn't sync data implicitly. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
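A toy sketch of the prerequisite-write handling discussed in the HDFS-4750 comment above. The class and its bookkeeping are hypothetical and not taken from either gateway implementation; a real gateway would also record per-request arrival times so parked writes can be failed after the timeout, and would keep the NFS reply handles needed to answer each request.
{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;
import java.util.TreeMap;

// Park out-of-order WRITE requests keyed by offset and flush them only once the
// prerequisite bytes have arrived, so the backing stream is always written sequentially.
public class SequentialWriteBuffer {
  private final OutputStream out;                                 // e.g. the HDFS stream for the file
  private final TreeMap<Long, byte[]> pending = new TreeMap<>();  // offset -> parked data
  private long nextOffset;                                        // next byte the stream expects

  public SequentialWriteBuffer(OutputStream out, long startOffset) {
    this.out = out;
    this.nextOffset = startOffset;
  }

  public synchronized void write(long offset, byte[] data) throws IOException {
    if (offset < nextOffset) {
      throw new IOException("random write at offset " + offset + " is not supported");
    }
    pending.put(offset, data);
    // Flush every parked request whose prerequisite bytes are now present.
    Map.Entry<Long, byte[]> head;
    while ((head = pending.firstEntry()) != null && head.getKey() == nextOffset) {
      out.write(head.getValue());
      nextOffset += head.getValue().length;
      pending.pollFirstEntry();
    }
  }
}
{code}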
[jira] [Updated] (HDFS-4742) Fix appending a renamed file
[ https://issues.apache.org/jira/browse/HDFS-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4742: - Component/s: (was: datanode) Hadoop Flags: Reviewed +1 patch looks good. Fix appending a renamed file Key: HDFS-4742 URL: https://issues.apache.org/jira/browse/HDFS-4742 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-4742.001.patch Fix bug for appending a renamed file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642068#comment-13642068 ] Todd Lipcon commented on HDFS-4750: --- {quote} If the application expects that, after closing file through one NFS gateway, the new data is immediately available to all other NFS gateways, the application should do a sync call after close. This is not a limitation only to this NFS implemtation. POSIX close doesn't sync data implicitly. {quote} I don't think this is right. POSIX doesn't ensure that close() syncs data (makes it durable), but NFS _does_ require that close() makes it _visible_ to other clients (so-called close-to-open consistency): {quote} The NFS standard requires clients to maintain close-to-open cache coherency when multiple clients access the same files [5, 6, 10]. This means flushing all file data and metadata changes when a client closes a file, and immediately and unconditionally retrieving a file's attributes when it is opened via the open() system call API. In this way, changes made by one client appear as soon as a file is opened on any other client. {quote} (from http://www.citi.umich.edu/projects/nfs-perf/results/cel/dnlc.html) Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-4749) Use INodeId to identify the corresponding directory node for FSImage saving/loading
[ https://issues.apache.org/jira/browse/HDFS-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-4749. -- Resolution: Fixed Fix Version/s: Snapshot (HDFS-2802) I have committed this. Thanks, Jing! Use INodeId to identify the corresponding directory node for FSImage saving/loading --- Key: HDFS-4749 URL: https://issues.apache.org/jira/browse/HDFS-4749 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Fix For: Snapshot (HDFS-2802) Attachments: HDFS-4749.000.patch Currently in fsimage, we use the path to locate a directory node for later loading, i.e., when loading a subtree from fsimage, we first read the path of the directory node, and resolve the path to identify the directory node. This brings extra complexity since we need to generate path for directory nodes in both the current tree and snapshot copies. As a simplification, we can use INodeId to identify the directory node. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4742) Fix appending to a renamed file with snapshot
[ https://issues.apache.org/jira/browse/HDFS-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4742: - Summary: Fix appending to a renamed file with snapshot (was: Fix appending a renamed file) Fix appending to a renamed file with snapshot - Key: HDFS-4742 URL: https://issues.apache.org/jira/browse/HDFS-4742 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-4742.001.patch Fix bug for appending a renamed file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-4742) Fix appending to a renamed file with snapshot
[ https://issues.apache.org/jira/browse/HDFS-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-4742. -- Resolution: Fixed Fix Version/s: Snapshot (HDFS-2802) I have committed this. Thanks, Jing! Fix appending to a renamed file with snapshot - Key: HDFS-4742 URL: https://issues.apache.org/jira/browse/HDFS-4742 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Jing Zhao Assignee: Jing Zhao Fix For: Snapshot (HDFS-2802) Attachments: HDFS-4742.001.patch Fix bug for appending a renamed file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-4755) AccessControlException message is changed in snapshot branch
[ https://issues.apache.org/jira/browse/HDFS-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE reassigned HDFS-4755: Assignee: Tsz Wo (Nicholas), SZE AccessControlException message is changed in snapshot branch Key: HDFS-4755 URL: https://issues.apache.org/jira/browse/HDFS-4755 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor [~rramya] observed the following - Trunk: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=hrt_qa, access=WRITE, inode=hdfs:hdfs:hdfs:rwx-x-x - Snapshot branch: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=hrt_qa, access=WRITE, inode=/user/hdfs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4434) Provide a mapping from INodeId to INode
[ https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642096#comment-13642096 ] Konstantin Shvachko commented on HDFS-4434: --- Sorry now I see what I missed. Was following this jira. Will post my concerns under HDFS-4489. Provide a mapping from INodeId to INode --- Key: HDFS-4434 URL: https://issues.apache.org/jira/browse/HDFS-4434 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Suresh Srinivas Fix For: 3.0.0, 2.0.5-beta Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch This JIRA is to provide a way to access the INode via its id. The proposed solution is to have an in-memory mapping from INodeId to INode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642102#comment-13642102 ] Hari Mankude commented on HDFS-4750: Implementing writes might not be easy. The client implementations in various kernels do not guarantee that writes are issued in sequential order. Page flushing algorithms try to find contiguous pages (offsets). However, there are other factors in play with page flushing algorithms. So writes from the client are not guaranteed to arrive in the sequential order that HDFS requires. This is true whether the writes are coming in lazily from the client or due to a sync() before close(). A possible solution is for the NFS gateway on the DFS client to cache and reorder the writes to be sequential. But this might still result in holes, which HDFS cannot handle. Also, the cache requirements might not be trivial and might require a flush to local disk. NFS interfaces are very useful for reads. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642118#comment-13642118 ] Konstantin Shvachko commented on HDFS-4489: --- Posted a request for the basis for porting this to branch 2.0.5 in HDFS-4434. bq. What is the concern? My concern is that you committed an incompatible change, which is a new feature and a large change, into the stabilization branch without a vote or a release plan discussed with the community. Being a bad practice in general, I think it is a wrong move now in particular, because people are discussing the stabilization of 2.0.5. This feature totals about 150K of code in patches (counting subtasks only). Not helping stabilization. And you didn't give any reasons for the merge. I would like to ask you to revert this merge from branch 2.0.5 and follow the procedures for merging features into new release branches if you decide to proceed. Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefit of using InodeID to uniquely identify a file can be multiple folds. Here are a few of them: 1. uniquely identify a file cross rename, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is no t reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4650) Add rename test in TestSnapshot
[ https://issues.apache.org/jira/browse/HDFS-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4650: Summary: Add rename test in TestSnapshot (was: Add unit tests for rename with snapshots) Add rename test in TestSnapshot --- Key: HDFS-4650 URL: https://issues.apache.org/jira/browse/HDFS-4650 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Jing Zhao Assignee: Jing Zhao Add more unit tests and update current unit tests to cover different cases for rename with existence of snapshottable directories and snapshots. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4750) Support NFSv3 interface to HDFS
[ https://issues.apache.org/jira/browse/HDFS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642145#comment-13642145 ] Brandon Li commented on HDFS-4750: -- {quote}...but NFS does require that close() makes it visible to other clients (so-called close-to-open consistency){quote} The protocol itself provides no facility to guarantee that cached data is consistent with the data on the server, but close-to-open consistency is the recommended behavior for implementations. Support NFSv3 interface to HDFS --- Key: HDFS-4750 URL: https://issues.apache.org/jira/browse/HDFS-4750 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HADOOP-NFS-Proposal.pdf Access HDFS is usually done through HDFS Client or webHDFS. Lack of seamless integration with client’s file system makes it difficult for users and impossible for some applications to access HDFS. NFS interface support is one way for HDFS to have such easy integration. This JIRA is to track the NFS protocol support for accessing HDFS. With HDFS client, webHDFS and the NFS interface, HDFS will be easier to access and be able support more applications and use cases. We will upload the design document and the initial implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
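To illustrate the sync-before-depending-on-visibility advice from the HDFS-4750 discussion above, a small sketch follows. The NFS mount path is hypothetical; FileChannel.force(true) maps to fsync on common platforms and, per the earlier comment, can trigger an NFS COMMIT so the gateway persists the data before the file is closed.
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SyncBeforeClose {
  public static void main(String[] args) throws IOException {
    try (FileChannel ch = FileChannel.open(Paths.get("/mnt/hdfs-nfs/data/part-0000"),
        StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
      ch.write(ByteBuffer.wrap("example record\n".getBytes(StandardCharsets.UTF_8)));
      ch.force(true);   // fsync: flush dirty pages so the gateway commits the data
    }                   // close() after force(): other clients should now see the data
  }
}
{code}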
[jira] [Updated] (HDFS-4650) Add rename test in TestSnapshot
[ https://issues.apache.org/jira/browse/HDFS-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4650: Attachment: HDFS-4650.001.patch We have already added a set of unit tests in TestRenameWithSnapshots. This patch adds rename operation to TestSnapshot and fixes a bug. Two more unit tests are added to cover two cases where snapshot deletion happens after rename operation(s). Add rename test in TestSnapshot --- Key: HDFS-4650 URL: https://issues.apache.org/jira/browse/HDFS-4650 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Jing Zhao Assignee: Jing Zhao Attachments: HDFS-4650.001.patch Add more unit tests and update current unit tests to cover different cases for rename with existence of snapshottable directories and snapshots. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642185#comment-13642185 ] Suresh Srinivas commented on HDFS-4489: --- bq. My concern is that you committed an incompatible change Konstantin, not sure if you looked at the release notes. This change disallows a file or directory name called .reserved under root. That is the only reason why I marked it as incompatible. This is not related to wire or API incompatibility. That said, one of the goals for 2.0.5 is to drive towards a state where incompatible changes are not allowed after it. bq. which is a new feature and a large change, into the stabilization branch without a vote or a release plan discussed with the community. I agree that this is a new feature. Committers routinely promote changes that they consider okay to branch-2. I believe this does not add to the instability. Let me know if you disagree based on a code review/testing. Also, merging to branch-2 in a lot of cases is done based on a committer's judgement. Please look at various other jiras that were merged into branch-2 without a vote thread. I do not consider this a large feature. However, for the Snapshot feature, I would have brought that up in a release thread. bq. And you didn't give any reasons for the merge. I think there is enough motivation for the feature posted in the jira. bq. I would like to ask you to revert this merge from branch 2.0.5 and follow the procedures for merging features into new release branches if you decide to proceed. I have spent more than 12 hours merging the chain of jiras required and resolving conflicts before getting to the 4 changes that introduced file id. Is your concern about HDFS-4434 or all the related changes? Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefit of using InodeID to uniquely identify a file can be multiple folds. Here are a few of them: 1. uniquely identify a file cross rename, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is no t reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2576) Namenode should have a favored nodes hint to enable clients to have control over block placement.
[ https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642189#comment-13642189 ] Tsz Wo (Nicholas), SZE commented on HDFS-2576: -- - In BlockPlacementPolicyDefault, it should use the oldExcludedNodes when choosing the remaining nodes. Also, what should happen if a node is in both favoredNodes and excludedNodes? - In BlockPlacementPolicy and BlockPlacementPolicyDefault, use Collections.<DatanodeDescriptor>emptyList() instead of new ArrayList<DatanodeDescriptor>(). - Change void resolveNetworkLocation(DatanodeDescriptor) to String resolveNetworkLocation(DatanodeID). Then, we only need to create DatanodeID but not DatanodeDescriptor in the new resolveNetworkLocation(String) method. - Move getDatanodeDescriptor(String address) from BlockManager to DatanodeManager. Namenode should have a favored nodes hint to enable clients to have control over block placement. - Key: HDFS-2576 URL: https://issues.apache.org/jira/browse/HDFS-2576 Project: Hadoop HDFS Issue Type: New Feature Reporter: Pritam Damania Assignee: Devaraj Das Fix For: 2.0.5-beta Attachments: hdfs-2576-1.txt, hdfs-2576-trunk-1.patch, hdfs-2576-trunk-2.patch, hdfs-2576-trunk-7.1.patch, hdfs-2576-trunk-7.patch, hdfs-2576-trunk-8.1.patch, hdfs-2576-trunk-8.patch Sometimes clients like HBase need to dynamically compute the datanodes on which they wish to place the blocks for a file, for a higher level of locality. For this purpose there is a need for a way to give the Namenode a hint in terms of a favoredNodes parameter about the locations where the client wants to put each block. The proposed solution is a favored nodes parameter in the addBlock() method and in the create() file method to enable the clients to give the hints to the NameNode about the locations of each replica of the block. Note that this would be just a hint and finally the NameNode would look at disk usage, datanode load etc. and decide whether it can respect the hints or not. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4756) Implement ONCRPC and XDR
Brandon Li created HDFS-4756: Summary: Implement ONCRPC and XDR Key: HDFS-4756 URL: https://issues.apache.org/jira/browse/HDFS-4756 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Brandon Li This is to track the implementation of ONCRPC (RFC 5531) and XDR (RFC 4506). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
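Independent of whatever classes the HDFS-4756 patch eventually introduces, here is a small sketch of the XDR (RFC 4506) encoding rules such an implementation has to follow: 32-bit big-endian primitives, length-prefixed variable data, and zero padding to 4-byte boundaries. Class and method names are illustrative only.
{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class XdrEncoder {
  private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
  private final DataOutputStream out = new DataOutputStream(buf);

  void writeInt(int v) throws IOException {
    out.writeInt(v);                               // DataOutputStream is already big-endian
  }

  void writeString(String s) throws IOException {
    byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
    out.writeInt(bytes.length);                    // length prefix
    out.write(bytes);
    for (int pad = (4 - bytes.length % 4) % 4; pad > 0; pad--) {
      out.write(0);                                // zero padding to a 4-byte boundary
    }
  }

  byte[] toByteArray() throws IOException {
    out.flush();
    return buf.toByteArray();
  }
}
{code}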
[jira] [Commented] (HDFS-4610) Move to using common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute
[ https://issues.apache.org/jira/browse/HDFS-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642203#comment-13642203 ] Chris Nauroth commented on HDFS-4610: - The changes look good, Ivan. I tested this in combination with the latest patches for HADOOP-9290, HADOOP-9413, HADOOP-9490, and HDFS-4705. I was expecting all of these patches in combination to make the following tests pass on Windows: {{TestCheckpoint}} (mostly), {{TestNNStorageRetentionFunctional}}, and {{TestStorageRestore}}. They still failed though with the same errors that make it look like marking directories as bad didn't work as expected. I verified that hadoop.dll was loaded successfully for each of the tests. Were you expecting the patch to fix these tests? Do they pass in your environment now? Move to using common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute -- Key: HDFS-4610 URL: https://issues.apache.org/jira/browse/HDFS-4610 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HDFS-4610.commonfileutils.2.patch, HDFS-4610.commonfileutils.patch Switch to using common utils described in HADOOP-9413 that work well cross-platform. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4610) Move to using common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute
[ https://issues.apache.org/jira/browse/HDFS-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642211#comment-13642211 ] Arpit Agarwal commented on HDFS-4610: - Chris, did you try to apply any specific patches to those three tests? I don't think the FileUtil changes by themselves are sufficient right now. Move to using common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute -- Key: HDFS-4610 URL: https://issues.apache.org/jira/browse/HDFS-4610 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HDFS-4610.commonfileutils.2.patch, HDFS-4610.commonfileutils.patch Switch to using common utils described in HADOOP-9413 that work well cross-platform. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642212#comment-13642212 ] Nathan Roberts commented on HDFS-4489: -- Sorry this is a really late comment but I'd really like to see some performance numbers before and after. While 6.5% increase in overall heap size is not massive, my main concern is the 25% increase in a very core data structure within the NN (1.07G-1.37G in Todd's measurement of INodeFile). This could cause significant cache pollution and therefore could have a very measurable impact on performance. I don't know for sure that it will, but it seems it would be reasonable to verify. Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefit of using InodeID to uniquely identify a file can be multiple folds. Here are a few of them: 1. uniquely identify a file cross rename, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is no t reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4757) Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota
Jing Zhao created HDFS-4757: --- Summary: Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota Key: HDFS-4757 URL: https://issues.apache.org/jira/browse/HDFS-4757 Project: Hadoop HDFS Issue Type: Bug Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor When setting quota to a directory, we may need to replace the original directory node with a new node with the same id. We need to update the inodeMap after the node replacement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4610) Move to using common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute
[ https://issues.apache.org/jira/browse/HDFS-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642217#comment-13642217 ] Chris Nauroth commented on HDFS-4610: - {quote} I don't think the FileUtil changes by themselves are sufficient right now. {quote} Oh, that's right. {{TestStorageRestore}} is tracked separately in HDFS-4741. I do see that this patch is making changes in {{TestCheckpoint}} and {{TestNNStorageRetentionFunctional}} though. Ivan, can you clarify if this patch makes these 2 tests pass for you? Move to using common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute -- Key: HDFS-4610 URL: https://issues.apache.org/jira/browse/HDFS-4610 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HDFS-4610.commonfileutils.2.patch, HDFS-4610.commonfileutils.patch Switch to using common utils described in HADOOP-9413 that work well cross-platform. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4757) Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota
[ https://issues.apache.org/jira/browse/HDFS-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4757: Affects Version/s: 3.0.0 Status: Patch Available (was: Open) Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota Key: HDFS-4757 URL: https://issues.apache.org/jira/browse/HDFS-4757 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-4757.001.patch When setting quota to a directory, we may need to replace the original directory node with a new node with the same id. We need to update the inodeMap after the node replacement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4757) Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota
[ https://issues.apache.org/jira/browse/HDFS-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4757: Attachment: HDFS-4757.001.patch A simple patch with unit test. Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota Key: HDFS-4757 URL: https://issues.apache.org/jira/browse/HDFS-4757 Project: Hadoop HDFS Issue Type: Bug Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-4757.001.patch When setting quota to a directory, we may need to replace the original directory node with a new node with the same id. We need to update the inodeMap after the node replacement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642225#comment-13642225 ] Suresh Srinivas commented on HDFS-4489: --- [~nroberts] What performance test would like to be run with and without this change? Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefit of using InodeID to uniquely identify a file can be multiple folds. Here are a few of them: 1. uniquely identify a file cross rename, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is no t reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642225#comment-13642225 ] Suresh Srinivas edited comment on HDFS-4489 at 4/25/13 9:14 PM: [~nroberts] What performance test would you like to see run with and without this change? NNBench? was (Author: sureshms): [~nroberts] What performance test would like to be run with and without this change? Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefit of using InodeID to uniquely identify a file can be multiple folds. Here are a few of them: 1. uniquely identify a file cross rename, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is no t reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4758) Disallow nested snapshottable directories
Tsz Wo (Nicholas), SZE created HDFS-4758: Summary: Disallow nested snapshottable directories Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642234#comment-13642234 ] Todd Lipcon commented on HDFS-4758: --- Based on the discussion at the contributors meetup a few months back, didn't a lot of us anticipate the use case of a snapshottable /user, and then allowing each user or application to then make snapshots of their own /user/foo? Or a snapshottable root directory and then snapshottable subdirs? Disallow nested snapshottable directories - Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas reopened HDFS-4489: --- Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefit of using InodeID to uniquely identify a file can be multiple folds. Here are a few of them: 1. uniquely identify a file cross rename, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is no t reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642242#comment-13642242 ] Suresh Srinivas commented on HDFS-4489: --- I have reverted HDFS-4434 from branch-2. Will post the performance numbers and then commit the change to branch-2, based on that discussion. Use InodeID as as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefit of using InodeID to uniquely identify a file can be multiple folds. Here are a few of them: 1. uniquely identify a file cross rename, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the file name and size combination is no t reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HDFS-4434) Provide a mapping from INodeId to INode
[ https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas reopened HDFS-4434: --- Provide a mapping from INodeId to INode --- Key: HDFS-4434 URL: https://issues.apache.org/jira/browse/HDFS-4434 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Suresh Srinivas Fix For: 3.0.0, 2.0.5-beta Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch This JIRA is to provide a way to access the INode via its id. The proposed solution is to have an in-memory mapping from INodeId to INode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4434) Provide a mapping from INodeId to INode
[ https://issues.apache.org/jira/browse/HDFS-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-4434: -- Fix Version/s: (was: 2.0.5-beta) Provide a mapping from INodeId to INode --- Key: HDFS-4434 URL: https://issues.apache.org/jira/browse/HDFS-4434 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Suresh Srinivas Fix For: 3.0.0 Attachments: HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch, HDFS-4434.patch This JIRA is to provide a way to access the INode via its id. The proposed solution is to have an in-memory mapping from INodeId to INode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642244#comment-13642244 ] Tsz Wo (Nicholas), SZE commented on HDFS-4758: -- For your first example, we may set snapshottable for all users. Then taking snapshots of /user can be done by taking snapshots for all user dirs. For root, I think we should not encourage taking snapshots since all the tmp files will sit in a snapshot forever. Instead, we should encourage taking snapshots in subdirectories. It seems that nested snapshottable directories bring only a little convenience with a lot of inefficiencies. What do you think? Disallow nested snapshottable directories - Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
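A rough sketch of the per-user-directory approach described above, assuming the snapshot API the HDFS-2802 branch exposes on DistributedFileSystem (createSnapshot on a path the admin has already made snapshottable). Note that, unlike a single snapshot of /user, this loop is not atomic across users, which is the objection raised in the next comment.
{code:java}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class PerUserSnapshots {
    // Take one snapshot per user directory instead of one snapshot of /user.
    // Assumes snapshots have already been allowed on each user directory.
    public static void snapshotAllUsers(DistributedFileSystem dfs, String name)
            throws Exception {
        for (FileStatus stat : dfs.listStatus(new Path("/user"))) {
            if (stat.isDirectory()) {
                dfs.createSnapshot(stat.getPath(), name);
            }
        }
    }
}
{code}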
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642256#comment-13642256 ] Todd Lipcon commented on HDFS-4758: --- bq. For your first example, we may set snapshottable for all users. Then taking snapshots of /user can be done by taking snapshots for all user dirs. But then the snapshot is not atomic across users, which makes it difficult to do a point-in-time backup of a cluster by snapshot+distcp. bq. For root, I think we should not encourage taking snapshots since all the tmp files will sit in a snapshot forever. Instead, we should encourage taking snapshots in subdirectories. A common use case for snapshots is a short-lived snapshot which is then used as the source for distcp. The distcp can explicitly exclude copying tmp files. Once the distcp is complete, then the snapshot can be removed, so the space usage of the tmp files is only temporary. Disallow nested snapshottable directories - Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
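The short-lived snapshot + distcp pattern described in this comment could look roughly like the following sketch, under the same assumption about the DistributedFileSystem snapshot API; the snapshot name and backup destination are made up.
{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SnapshotThenCopy {
    // Freeze the namespace, copy from the frozen view, then drop the snapshot
    // so the space held by tmp files is only retained for the copy's duration.
    public static void backup(DistributedFileSystem dfs) throws Exception {
        Path root = new Path("/");
        String name = "backup-20130426";        // made-up snapshot name
        dfs.createSnapshot(root, name);
        try {
            // Run distcp against the frozen view, excluding tmp files, e.g.:
            //   hadoop distcp /.snapshot/backup-20130426 hdfs://backup-nn/...
        } finally {
            dfs.deleteSnapshot(root, name);     // snapshot no longer needed
        }
    }
}
{code}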
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642270#comment-13642270 ] Hari Mankude commented on HDFS-4758: Actually, one use case of nested snapshots that I see is that a user might have different backup policies for /user (once every day) and /user/hive (every 8 hrs). When backing up /user, it is possible to set up an exclusion for the /user/hive directory so that two copies of /user/hive are not made. However, if snapshots cannot be taken of /user and /user/hive at the same time, it would be a disadvantage. Disallow nested snapshottable directories - Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642280#comment-13642280 ] Suresh Srinivas commented on HDFS-4758: --- The current branch does support nested snapshots. The reason this is being proposed is to provide limited functionality early on and, if that suffices, remove the complexity associated with supporting and testing nested snapshots at arbitrary levels. If we support nested snapshots now, it will not be possible to take it away in the future. In my discussion with Jing and Nicholas we had considered the following: # Allow snapshot of / as a special case, and that is the only snapshottable directory within which another snapshottable directory is allowed. # Disallow nested snapshots in the first phase. See how this is used by the users and enable it either as stated in the first bullet or remove the restriction altogether. If the current restriction takes away some use cases, so be it. Let's turn it back on later if we cannot live without it. Disallow nested snapshottable directories - Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4489) Use InodeID as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642290#comment-13642290 ] Sanjay Radia commented on HDFS-4489: Nathan, a question. Suresh is willing to do the performance benchmark, but I am trying to understand where you are coming from. Yahoo and FB create very large namespaces by simply buying more memory and increasing the size of the heap. Do you worry about cache pollution when you create 50K more files? Given that the NN heap (many GBs) is so much larger than the cache, does the additional inode and inode-map size impact the overall system performance? Suresh has argued that a 24GB heap grows by 625MB. Looking at this feature's memory growth as a percentage of the total heap size is a more realistic way to assess its impact than looking at the growth of an individual data structure like the inode. IMHO, not having an inode-map and inode number was a serious limitation in the original implementation of the NN. I am willing to pay for the extra memory given the value the inode-id and inode-map bring (as described by Suresh at the beginning of this JIRA). Permissions, access time, etc. added to the memory cost of the NN and were accepted because of the value they bring. Use InodeID as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefits of using InodeID to uniquely identify a file are multi-fold. Here are a few of them: 1. uniquely identify a file across renames, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed, the file name and size combination is not reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
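Spelling out the arithmetic behind the percentage argument above, using only the numbers already quoted in this thread: 625 MB of growth on a 24 GB (24,576 MB) heap is 625 / 24576 ≈ 2.5%, i.e. a low single-digit-percent increase in total NameNode heap rather than a large fraction of it.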
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642316#comment-13642316 ] Hari Mankude commented on HDFS-4758: The trade-off is between usability vs complexity. In this case, it might result in issues where a user has taken a snapshot of /user/foo/dir1 and admin finds that system-wide snaps cannot be taken at say /user dir levels since there are several users with their snapshots at lower directories. This might limit the usability of the feature. Disallow nested snapshottable directories - Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4610) Move to using common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute
[ https://issues.apache.org/jira/browse/HDFS-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642326#comment-13642326 ] Arpit Agarwal commented on HDFS-4610: - It looks like recursive permission changes (winutils chmod -R) are broken on Windows. This is breaking TestStorageRestore. I filed HADOOP-9508 and attached a test case to reproduce the issue. Move to using common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute -- Key: HDFS-4610 URL: https://issues.apache.org/jira/browse/HDFS-4610 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Ivan Mitic Assignee: Ivan Mitic Attachments: HDFS-4610.commonfileutils.2.patch, HDFS-4610.commonfileutils.patch Switch to using common utils described in HADOOP-9413 that work well cross-platform. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
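For reference, a small sketch of the switch this issue's title describes: replacing direct java.io.File permission calls with the cross-platform helpers from HADOOP-9413. The exact FileUtil signatures are assumed to mirror java.io.File and may differ slightly.
{code:java}
import java.io.File;
import org.apache.hadoop.fs.FileUtil;

public class PermissionHelpers {
    // Make a storage directory inaccessible in a way that also works on
    // Windows (where winutils is used under the covers), instead of relying
    // on java.io.File's POSIX-only semantics.
    static void makeInaccessible(File dir) {
        FileUtil.setReadable(dir, false);
        FileUtil.setWritable(dir, false);
        FileUtil.setExecutable(dir, false);
    }

    static boolean isFullyAccessible(File dir) {
        return FileUtil.canRead(dir) && FileUtil.canWrite(dir)
            && FileUtil.canExecute(dir);
    }
}
{code}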
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642328#comment-13642328 ] Jing Zhao commented on HDFS-4758: - bq. it might result in issues where a user has taken a snapshot of /user/foo/dir1 and admin finds that system-wide snaps cannot be taken at say /user dir levels since there are several users with their snapshots at lower directories. This may not be an issue because only admin can allow snapshot on a directory (i.e., to convert a directory to a snapshottable directory). The admin can specify a set of snapshottable directories in the beginning under which no more snapshottable directory will be allowed in the future. Disallow nested snapshottable directories - Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4741) TestStorageRestore#testStorageRestoreFailure fails on Windows
[ https://issues.apache.org/jira/browse/HDFS-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642339#comment-13642339 ] Arpit Agarwal commented on HDFS-4741: - The main cause of the failure appears to be winutils chmod -R being broken on Windows. Filed and linked HADOOP-9508. If the winutils fix by itself is sufficient to resolve this test failure I will resolve it as a duplicate. TestStorageRestore#testStorageRestoreFailure fails on Windows - Key: HDFS-4741 URL: https://issues.apache.org/jira/browse/HDFS-4741 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Fix For: 3.0.0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4757) Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota
[ https://issues.apache.org/jira/browse/HDFS-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642341#comment-13642341 ] Hadoop QA commented on HDFS-4757: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580589/HDFS-4757.001.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4319//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4319//console This message is automatically generated. Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota Key: HDFS-4757 URL: https://issues.apache.org/jira/browse/HDFS-4757 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-4757.001.patch When setting quota to a directory, we may need to replace the original directory node with a new node with the same id. We need to update the inodeMap after the node replacement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
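The invariant behind HDFS-4757 (and the follow-up HDFS-4760 further below) is simple enough to show with simplified stand-in types; this is not the real FSDirectory/INode code, only the shape of the fix.
{code:java}
import java.util.HashMap;
import java.util.Map;

class InodeMapSketch {
    static class Inode {
        final long id;
        Inode(long id) { this.id = id; }
    }

    private final Map<Long, Inode> inodeMap = new HashMap<Long, Inode>();

    void add(Inode inode) {
        inodeMap.put(inode.id, inode);
    }

    // When setting a quota replaces a directory inode with a new object that
    // keeps the same id, the map must be repointed at the new object;
    // otherwise lookups by id keep returning the stale node.
    void replace(Inode oldNode, Inode newNode) {
        assert oldNode.id == newNode.id;
        inodeMap.put(newNode.id, newNode);
    }

    Inode get(long id) {
        return inodeMap.get(id);
    }
}
{code}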
[jira] [Commented] (HDFS-4758) Disallow nested snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642352#comment-13642352 ] Tsz Wo (Nicholas), SZE commented on HDFS-4758: -- But then the snapshot is not atomic across users, ... It is easy to change the API to support it. Disallow nested snapshottable directories - Key: HDFS-4758 URL: https://issues.apache.org/jira/browse/HDFS-4758 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Nested snapshottable directories are supported by the current implementation. However, it seems that there are no good use cases for nested snapshottable directories. So we disable it for now until someone has a valid use case for it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4489) Use InodeID as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642355#comment-13642355 ] Konstantin Shvachko commented on HDFS-4489: --- Suresh, whatever the reason for the incompatibility, it should go through the approval process. You also committed the LayoutVersion change HDFS-4296. Now it requires an upgrade. One of the goals for 2.0.5 is to drive towards a state where incompatible changes are not allowed after it. That was the goal for Hadoop 0.20. I thought the goal for 2.0.5 is stabilization. Also merging to branch-2 in a lot of cases is done based on a committer's judgement. I think it is wrong. Especially for the stabilization release. I think there is enough motivation for the feature posted in the JIRA. I am not arguing about the value of the feature, but about its necessity for 2.0.5. Is your concern about HDFS-4434 or all the related changes? Most of them. I would have reviewed if I had a proper warning. So again why is it necessary for 2.0.5? Use InodeID as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefits of using InodeID to uniquely identify a file are multi-fold. Here are a few of them: 1. uniquely identify a file across renames, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed, the file name and size combination is not reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4305) Add a configurable limit on number of blocks per file, and min block size
[ https://issues.apache.org/jira/browse/HDFS-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642421#comment-13642421 ] Andrew Wang commented on HDFS-4305: --- I believe the test failure is unrelated. It's a null pointer in MiniDFSCluster#shutdown (so harmless), and it didn't happen when I ran the test a few times locally. Add a configurable limit on number of blocks per file, and min block size - Key: HDFS-4305 URL: https://issues.apache.org/jira/browse/HDFS-4305 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 1.0.4, 2.0.4-alpha Reporter: Todd Lipcon Assignee: Andrew Wang Priority: Minor Attachments: hdfs-4305-1.patch, hdfs-4305-2.patch, hdfs-4305-3.patch We recently had an issue where a user set the block size very very low and managed to create a single file with hundreds of thousands of blocks. This caused problems with the edit log since the OP_ADD op was so large (HDFS-4304). I imagine it could also cause efficiency issues in the NN. To prevent users from making such mistakes, we should: - introduce a configurable minimum block size, below which requests are rejected - introduce a configurable maximum number of blocks per file, above which requests to add another block are rejected (with a suitably high default as to not prevent legitimate large files) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
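A sketch of how an admin might set the two limits this issue proposes. The property names below (dfs.namenode.fs-limits.*) are what the patch appears to introduce and should be treated as assumptions until it is committed; the values are illustrative.
{code:java}
import org.apache.hadoop.conf.Configuration;

public class NameNodeLimits {
    // Tighten the proposed limits: reject tiny block sizes and cap the
    // number of blocks a single file may have.
    public static Configuration withLimits() {
        Configuration conf = new Configuration();
        conf.setLong("dfs.namenode.fs-limits.min-block-size", 1024L * 1024L);
        conf.setLong("dfs.namenode.fs-limits.max-blocks-per-file", 100000L);
        return conf;
    }
}
{code}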
[jira] [Updated] (HDFS-4755) AccessControlException message is changed in snapshot branch
[ https://issues.apache.org/jira/browse/HDFS-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4755: - Attachment: h4755_20130425.patch h4755_20130425.patch: - changes the message; - also, moves implements LinkedElement from INode to INodeWithAdditionalFields. AccessControlException message is changed in snapshot branch Key: HDFS-4755 URL: https://issues.apache.org/jira/browse/HDFS-4755 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h4755_20130425.patch [~rramya] observed the following - Trunk: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=hrt_qa, access=WRITE, inode=hdfs:hdfs:hdfs:rwx-x-x - Snapshot branch: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=hrt_qa, access=WRITE, inode=/user/hdfs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4757) Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota
[ https://issues.apache.org/jira/browse/HDFS-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4757: - Component/s: namenode Hadoop Flags: Reviewed +1 patch looks good. Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota Key: HDFS-4757 URL: https://issues.apache.org/jira/browse/HDFS-4757 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-4757.001.patch When setting quota to a directory, we may need to replace the original directory node with a new node with the same id. We need to update the inodeMap after the node replacement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4755) AccessControlException message is changed in snapshot branch
[ https://issues.apache.org/jira/browse/HDFS-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642448#comment-13642448 ] Jing Zhao commented on HDFS-4755: - +1 for the patch. AccessControlException message is changed in snapshot branch Key: HDFS-4755 URL: https://issues.apache.org/jira/browse/HDFS-4755 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Attachments: h4755_20130425.patch [~rramya] observed the following - Trunk: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=hrt_qa, access=WRITE, inode=hdfs:hdfs:hdfs:rwx-x-x - Snapshot branch: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=hrt_qa, access=WRITE, inode=/user/hdfs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4757) Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota
[ https://issues.apache.org/jira/browse/HDFS-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4757: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, Jing! Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota Key: HDFS-4757 URL: https://issues.apache.org/jira/browse/HDFS-4757 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 3.0.0 Attachments: HDFS-4757.001.patch When setting quota to a directory, we may need to replace the original directory node with a new node with the same id. We need to update the inodeMap after the node replacement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4757) Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota
[ https://issues.apache.org/jira/browse/HDFS-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642453#comment-13642453 ] Hudson commented on HDFS-4757: -- Integrated in Hadoop-trunk-Commit #3667 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3667/]) HDFS-4757. Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota. Contributed by Jing Zhao (Revision 1476005) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1476005 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestINodeFile.java Update FSDirectory#inodeMap when replacing an INodeDirectory while setting quota Key: HDFS-4757 URL: https://issues.apache.org/jira/browse/HDFS-4757 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 3.0.0 Attachments: HDFS-4757.001.patch When setting quota to a directory, we may need to replace the original directory node with a new node with the same id. We need to update the inodeMap after the node replacement. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4489) Use InodeID as an identifier of a file in HDFS protocols and APIs
[ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642455#comment-13642455 ] Suresh Srinivas commented on HDFS-4489: --- {quote} That was the goal for Hadoop 0.20. I thought the goal for 2.0.5 is stabilization. {quote} I am not sure if 0.20 is a typo. If it is not, I have a hard time parsing that statement. See the previous discussion about 2.0.4-beta (now called 2.0.5) in this thread: http://hadoop.markmail.org/thread/v44nqp466p76jpkj bq. I think it is wrong. Especially for the stabilization release. I disagree. I want to get some of the features I have been working on into this release. I think the goal of this release is to get API and wire compatibility stable. bq. Most of them. I would have reviewed if I had a proper warning. I am not sure what kind of warning you are talking about. HDFS-4434 has been in development for a long time with more than 32 iterations of the patch. bq. So again why is it necessary for 2.0.5? The snapshot and NFS features depend on this. I would like to see it become available in 2.0.5. Use InodeID as an identifier of a file in HDFS protocols and APIs Key: HDFS-4489 URL: https://issues.apache.org/jira/browse/HDFS-4489 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Brandon Li Assignee: Brandon Li Fix For: 2.0.5-beta The benefits of using InodeID to uniquely identify a file are multi-fold. Here are a few of them: 1. uniquely identify a file across renames, related JIRAs include HDFS-4258, HDFS-4437. 2. modification checks in tools like distcp. Since a file could have been replaced or renamed, the file name and size combination is not reliable, but the combination of file id and size is unique. 3. id based protocol support (e.g., NFS) 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4759) snapshotDiff of two invalid snapshots but with same name returns success
Ramya Sunil created HDFS-4759: - Summary: snapshotDiff of two invalid snapshots but with same name returns success Key: HDFS-4759 URL: https://issues.apache.org/jira/browse/HDFS-4759 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ramya Sunil Fix For: Snapshot (HDFS-2802) snapshotDiff of two invalid snapshots which have the same names returns a success. $ hadoop dfs -ls /user/foo/hdfs-snapshots/.snapshot Found 1 items drwx-- - foo foo 0 2013-04-26 00:53 /user/foo/hdfs-snapshots/.snapshot/s1 $ hadoop snapshotDiff /user/foo/hdfs-snapshots invalid invalid Difference between snapshot invalid and snapshot invalid under directory /user/foo/hdfs-snapshots: -bash-4.1$ echo $? 0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
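The missing check is straightforward to sketch with stand-in code (not the real SnapshotManager): a diff request should fail when either snapshot name does not exist, instead of treating two identical invalid names as an empty diff.
{code:java}
import java.util.Set;

class SnapshotDiffValidation {
    // Fail fast if either end of the diff refers to a snapshot that does not
    // exist under the snapshottable directory.
    static void validate(Set<String> existingSnapshots, String from, String to) {
        if (!existingSnapshots.contains(from)) {
            throw new IllegalArgumentException("Snapshot does not exist: " + from);
        }
        if (!existingSnapshots.contains(to)) {
            throw new IllegalArgumentException("Snapshot does not exist: " + to);
        }
    }
}
{code}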
[jira] [Updated] (HDFS-4756) Implement ONCRPC and XDR
[ https://issues.apache.org/jira/browse/HDFS-4756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-4756: - Issue Type: New Feature (was: Sub-task) Parent: (was: HDFS-4750) Implement ONCRPC and XDR Key: HDFS-4756 URL: https://issues.apache.org/jira/browse/HDFS-4756 Project: Hadoop HDFS Issue Type: New Feature Reporter: Brandon Li This is to track the implementation of ONCRPC(rfc5531) and XDR(rfc4506). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-4755) AccessControlException message is changed in snapshot branch
[ https://issues.apache.org/jira/browse/HDFS-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-4755. -- Resolution: Fixed Fix Version/s: Snapshot (HDFS-2802) Hadoop Flags: Reviewed Thanks Jing for reviewing the patch. I have committed this. AccessControlException message is changed in snapshot branch Key: HDFS-4755 URL: https://issues.apache.org/jira/browse/HDFS-4755 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: Snapshot (HDFS-2802) Attachments: h4755_20130425.patch [~rramya] observed the following - Trunk: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=hrt_qa, access=WRITE, inode=hdfs:hdfs:hdfs:rwx-x-x - Snapshot branch: mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=hrt_qa, access=WRITE, inode=/user/hdfs -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-4759) snapshotDiff of two invalid snapshots but with same name returns success
[ https://issues.apache.org/jira/browse/HDFS-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao reassigned HDFS-4759: --- Assignee: Jing Zhao snapshotDiff of two invalid snapshots but with same name returns success Key: HDFS-4759 URL: https://issues.apache.org/jira/browse/HDFS-4759 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Ramya Sunil Assignee: Jing Zhao Fix For: Snapshot (HDFS-2802) snapshotDiff of two invalid snapshots which have the same names returns a success. $ hadoop dfs -ls /user/foo/hdfs-snapshots/.snapshot Found 1 items drwx-- - foo foo 0 2013-04-26 00:53 /user/foo/hdfs-snapshots/.snapshot/s1 $ hadoop snapshotDiff /user/foo/hdfs-snapshots invalid invalid Difference between snapshot invalid and snapshot invalid under directory /user/foo/hdfs-snapshots: -bash-4.1$ echo $? 0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4760) Update inodeMap after node replacement
Jing Zhao created HDFS-4760: --- Summary: Update inodeMap after node replacement Key: HDFS-4760 URL: https://issues.apache.org/jira/browse/HDFS-4760 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Similar with HDFS-4757, we need to update the inodeMap after node replacement. Because a lot of node replacement happens in the snapshot branch (e.g., INodeDirectory - INodeDirectoryWithSnapshot, INodeDirectory - INodeDirectorySnapshottable, INodeFile - INodeFileWithSnapshot ...), this becomes a non-trivial issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4760) Update inodeMap after node replacement
[ https://issues.apache.org/jira/browse/HDFS-4760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4760: Description: Similar with HDFS-4757, we need to update the inodeMap after node replacement. Because a lot of node replacement happens in the snapshot branch (e.g., INodeDirectory = INodeDirectoryWithSnapshot, INodeDirectory = INodeDirectorySnapshottable, INodeFile = INodeFileWithSnapshot ...), this becomes a non-trivial issue. (was: Similar with HDFS-4757, we need to update the inodeMap after node replacement. Because a lot of node replacement happens in the snapshot branch (e.g., INodeDirectory - INodeDirectoryWithSnapshot, INodeDirectory - INodeDirectorySnapshottable, INodeFile - INodeFileWithSnapshot ...), this becomes a non-trivial issue.) Update inodeMap after node replacement -- Key: HDFS-4760 URL: https://issues.apache.org/jira/browse/HDFS-4760 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Jing Zhao Assignee: Jing Zhao Similar with HDFS-4757, we need to update the inodeMap after node replacement. Because a lot of node replacement happens in the snapshot branch (e.g., INodeDirectory = INodeDirectoryWithSnapshot, INodeDirectory = INodeDirectorySnapshottable, INodeFile = INodeFileWithSnapshot ...), this becomes a non-trivial issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4650) Add rename test in TestSnapshot
[ https://issues.apache.org/jira/browse/HDFS-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-4650: - Component/s: (was: datanode) test Priority: Minor (was: Major) Hadoop Flags: Reviewed +1 patch looks good. Add rename test in TestSnapshot --- Key: HDFS-4650 URL: https://issues.apache.org/jira/browse/HDFS-4650 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, test Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-4650.001.patch Add more unit tests and update current unit tests to cover different cases for rename with existence of snapshottable directories and snapshots. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4650) Add rename test in TestSnapshot
[ https://issues.apache.org/jira/browse/HDFS-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13642478#comment-13642478 ] Tsz Wo (Nicholas), SZE commented on HDFS-4650: -- With this, we now have 101 snapshot tests. {noformat} Tests run: 101, Failures: 0, Errors: 0, Skipped: 0 {noformat} Add rename test in TestSnapshot --- Key: HDFS-4650 URL: https://issues.apache.org/jira/browse/HDFS-4650 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, test Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-4650.001.patch Add more unit tests and update current unit tests to cover different cases for rename with existence of snapshottable directories and snapshots. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-4650) Add rename test in TestSnapshot
[ https://issues.apache.org/jira/browse/HDFS-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-4650. -- Resolution: Fixed Fix Version/s: Snapshot (HDFS-2802) I have committed this. Thanks, Jing! Add rename test in TestSnapshot --- Key: HDFS-4650 URL: https://issues.apache.org/jira/browse/HDFS-4650 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, test Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: Snapshot (HDFS-2802) Attachments: HDFS-4650.001.patch Add more unit tests and update current unit tests to cover different cases for rename with existence of snapshottable directories and snapshots. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-2802: - Attachment: h2802_20130425.patch h2802_20130425.patch Support for RW/RO snapshots in HDFS --- Key: HDFS-2802 URL: https://issues.apache.org/jira/browse/HDFS-2802 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Hari Mankude Assignee: Tsz Wo (Nicholas), SZE Attachments: 2802.diff, 2802.patch, 2802.patch, h2802_20130417.patch, h2802_20130422.patch, h2802_20130423.patch, h2802_20130425.patch, HDFS-2802.20121101.patch, HDFS-2802-meeting-minutes-121101.txt, HDFSSnapshotsDesign.pdf, snap.patch, snapshot-design.pdf, snapshot-design.tex, snapshot-one-pager.pdf, Snapshots20121018.pdf, Snapshots20121030.pdf, Snapshots.pdf, snapshot-testplan.pdf Snapshots are point in time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point in time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with with more information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4759) snapshotDiff of two invalid snapshots but with same name returns success
[ https://issues.apache.org/jira/browse/HDFS-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-4759: Attachment: HDFS-4759.001.patch Thanks for the catch Ramya! Upload the patch for the fix. Also update the unit test to cover this case. snapshotDiff of two invalid snapshots but with same name returns success Key: HDFS-4759 URL: https://issues.apache.org/jira/browse/HDFS-4759 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Reporter: Ramya Sunil Assignee: Jing Zhao Fix For: Snapshot (HDFS-2802) Attachments: HDFS-4759.001.patch snapshotDiff of two invalid snapshots which have the same names returns a success. $ hadoop dfs -ls /user/foo/hdfs-snapshots/.snapshot Found 1 items drwx-- - foo foo 0 2013-04-26 00:53 /user/foo/hdfs-snapshots/.snapshot/s1 $ hadoop snapshotDiff /user/foo/hdfs-snapshots invalid invalid Difference between snapshot invalid and snapshot invalid under directory /user/foo/hdfs-snapshots: -bash-4.1$ echo $? 0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira