[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480657#comment-13480657 ]

Suresh Srinivas commented on HDFS-2802:
---

Not sure if you read the discussion in the section on snapshots of files being written. I will add more details to the design. My earlier comment was related to strict consistency requirements.

Support for RW/RO snapshots in HDFS
---

Key: HDFS-2802
URL: https://issues.apache.org/jira/browse/HDFS-2802
Project: Hadoop HDFS
Issue Type: New Feature
Components: data-node, name-node
Reporter: Hari Mankude
Assignee: Hari Mankude
Attachments: snap.patch, snapshot-one-pager.pdf, Snapshots20121018.pdf

Snapshots are point-in-time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4022) Replication not happening for appended block
[ https://issues.apache.org/jira/browse/HDFS-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480661#comment-13480661 ]

Uma Maheswara Rao G commented on HDFS-4022:
---

Thanks Vinay for the patch. Thanks a lot, Nicholas, for your reviews. I will commit it some time today.

Replication not happening for appended block
---

Key: HDFS-4022
URL: https://issues.apache.org/jira/browse/HDFS-4022
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.0-alpha, 3.0.0
Reporter: suja s
Assignee: Vinay
Priority: Blocker
Attachments: HDFS-4022.patch, HDFS-4022.patch, HDFS-4022.patch, HDFS-4022.patch

A block was written and finalized. Append was called later, and the block's generation timestamp (genTS) changed.

DN side log, logged continuously:
Can't send invalid block BP-407900822-192.xx.xx.xx-1348830837061:blk_-9185630731157263852_108738

NN side log, also logged continuously:
INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Error report from DatanodeRegistration(192.xx.xx.xx, storageID=DS-2040532042-192.xx.xx.xx-50010-1348830863443, infoPort=50075, ipcPort=50020, storageInfo=lv=-40;cid=123456;nsid=116596173;c=0): Can't send invalid block BP-407900822-192.xx.xx.xx-1348830837061:blk_-9185630731157263852_108738

The block checked for transfer is the one with the old genTS, whereas the new block with the updated genTS exists in the data dir.
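The failure mode is easier to see as a generation-stamp check sketched in code. This is an illustrative stand-in, not the actual DataNode/NameNode classes, and the bumped genTS value is hypothetical:

```java
// Illustrative stand-in for a block identity: id plus generation stamp (genTS).
class Blk {
    final long id;
    final long genStamp;
    Blk(long id, long genStamp) { this.id = id; this.genStamp = genStamp; }
}

public class StaleGenStampDemo {
    // A block is valid for transfer only if the requested genTS matches the
    // genTS of the replica actually stored on disk.
    static boolean isValidForTransfer(Blk requested, Blk stored) {
        return requested.id == stored.id && requested.genStamp == stored.genStamp;
    }

    public static void main(String[] args) {
        // After append, the stored replica's genTS was bumped (108739 is a
        // hypothetical bumped value; the log above shows the old genTS 108738).
        Blk stored = new Blk(-9185630731157263852L, 108739L);
        // The replication request still names the pre-append genTS...
        Blk requested = new Blk(-9185630731157263852L, 108738L);
        // ...so the DN keeps rejecting it: "Can't send invalid block".
        System.out.println(isValidForTransfer(requested, stored)); // false
    }
}
```

As long as the replication work queue keeps naming the stale genTS, the same rejection repeats, which matches the continuously logged error.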
[jira] [Commented] (HDFS-4088) Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor
[ https://issues.apache.org/jira/browse/HDFS-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480688#comment-13480688 ]

Hudson commented on HDFS-4088:
---

Integrated in Hadoop-Yarn-trunk #9 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/9/])
HDFS-4088. Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor. (Revision 1400345)

Result = FAILURE
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400345
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectoryWithQuota.java

Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor
---

Key: HDFS-4088
URL: https://issues.apache.org/jira/browse/HDFS-4088
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Priority: Minor
Fix For: 2.0.3-alpha
Attachments: h4088_20121019.patch

The constructor body does not throw QuotaExceededException. We should remove it from the declaration.
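The change is mechanical; a minimal sketch with a simplified stand-in class (not the real INodeDirectoryWithQuota source) shows why a spurious throws clause is worth removing:

```java
// Checked exception mirroring the one in the issue title.
class QuotaExceededException extends Exception {}

// Simplified stand-in for INodeDirectoryWithQuota. The constructor used to
// declare "throws QuotaExceededException" even though the body only stores a
// value and can never raise it, forcing callers into pointless try/catch.
class DirWithQuota {
    private final long nsQuota;

    // The fix: drop the throws clause from the declaration.
    DirWithQuota(long nsQuota) {
        this.nsQuota = nsQuota;
    }

    long getNsQuota() { return nsQuota; }
}

public class ThrowsCleanupDemo {
    public static void main(String[] args) {
        // With the spurious throws clause gone, construction needs no handler.
        DirWithQuota d = new DirWithQuota(100L);
        System.out.println(d.getNsQuota()); // 100
    }
}
```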
[jira] [Commented] (HDFS-3483) Better error message when hdfs fsck is run against a ViewFS config
[ https://issues.apache.org/jira/browse/HDFS-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480715#comment-13480715 ]

Hudson commented on HDFS-3483:
---

Integrated in Hadoop-Hdfs-0.23-Build #410 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/410/])
svn merge -c 1394864 FIXES: HDFS-3483. Better error message when hdfs fsck is run against a ViewFS config. Contributed by Stephen Fritz. (Revision 1400218)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400218
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java

Better error message when hdfs fsck is run against a ViewFS config
---

Key: HDFS-3483
URL: https://issues.apache.org/jira/browse/HDFS-3483
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Stephen Chu
Assignee: Stephen Fritz
Labels: newbie
Fix For: 2.0.3-alpha, 0.23.5
Attachments: core-site.xml, HDFS-3483.patch, hdfs-site.xml

I'm running an HA + secure + federated cluster. When I run "hdfs fsck /nameservices/ha-nn-uri/", I see the following:

bash-3.2$ hdfs fsck /nameservices/ha-nn-uri/
FileSystem is viewfs://oracle/
DFSck exiting.

Any path I enter returns the same message. Attached are my core-site.xml and hdfs-site.xml.
[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails
[ https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480718#comment-13480718 ]

Hudson commented on HDFS-3873:
---

Integrated in Hadoop-Hdfs-0.23-Build #410 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/410/])
svn merge -c 1393777 FIXES: HDFS-3996. Add debug log removed in HDFS-3873 back. Contributed by Eli Collins (Revision 1400216)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400216
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java

Hftp assumes security is disabled if token fetch fails
---

Key: HDFS-3873
URL: https://issues.apache.org/jira/browse/HDFS-3873
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client
Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Fix For: 0.23.3, 2.0.2-alpha
Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch

Hftp ignores all exceptions generated while trying to get a token, on the assumption that a failure means security is disabled. Debugging problems is excruciatingly difficult when security is enabled but something goes wrong: job submissions succeed, but tasks fail because the NN rejects the user as unauthenticated.
[jira] [Commented] (HDFS-3996) Add debug log removed in HDFS-3873 back
[ https://issues.apache.org/jira/browse/HDFS-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480719#comment-13480719 ]

Hudson commented on HDFS-3996:
---

Integrated in Hadoop-Hdfs-0.23-Build #410 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/410/])
svn merge -c 1393777 FIXES: HDFS-3996. Add debug log removed in HDFS-3873 back. Contributed by Eli Collins (Revision 1400216)

Result = SUCCESS
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400216
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java

Add debug log removed in HDFS-3873 back
---

Key: HDFS-3996
URL: https://issues.apache.org/jira/browse/HDFS-3996
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.0.2-alpha
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
Fix For: 2.0.3-alpha, 0.23.5
Attachments: hdfs-3996.txt

Per HDFS-3873, let's add the debug log back.
[jira] [Commented] (HDFS-4088) Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor
[ https://issues.apache.org/jira/browse/HDFS-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480723#comment-13480723 ]

Hudson commented on HDFS-4088:
---

Integrated in Hadoop-Hdfs-trunk #1201 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1201/])
HDFS-4088. Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor. (Revision 1400345)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400345
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectoryWithQuota.java
[jira] [Commented] (HDFS-4088) Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor
[ https://issues.apache.org/jira/browse/HDFS-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480734#comment-13480734 ]

Hudson commented on HDFS-4088:
---

Integrated in Hadoop-Mapreduce-trunk #1231 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1231/])
HDFS-4088. Remove throws QuotaExceededException from an INodeDirectoryWithQuota constructor. (Revision 1400345)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1400345
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeDirectoryWithQuota.java
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480745#comment-13480745 ]

Hari Mankude commented on HDFS-2802:
---

Todd, another option is to look at the inodesUnderConstruction in the NN and query the DNs for the exact file size at the time of taking the snapshot. Even with this, the file size obtained is valid only at that instant. Applications like HBase will have to deal with hlogs that could have incomplete log entries when an uncoordinated snapshot is taken at the HDFS level. A better approach is to have the application reach a quiesce point and then take a snap. This is normally done for Oracle (hot backup mode) and SQL Server so that an application-consistent snapshot can be taken. Also, createSnap()/removeSnap() holds the writeLock() on the FSNamesystem, which ensures that there are no other metadata updates while the snap is being taken.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480757#comment-13480757 ]

Todd Lipcon commented on HDFS-2802:
---

Hi Suresh. Yes, I read the design there. In fact I think the design is based on my comment on HDFS-3960 from a few weeks ago. But after further thinking, I think that design is too weak. Here's why:

If the first DN in the pipeline had to RPC on every hflush, that would be way too many RPCs. HBase, for example, flushes several hundred times per second per server, so a 1000-node HBase cluster under heavy load would quickly take down a NameNode. So instead of the DN immediately RPCing on every hflush, it has to wait until the next heartbeat and report lengths with the heartbeat. Given this, it may be 5-10 seconds between the hflush and the report of the length to the NameNode. This means that the snapshot will get a length which is either 5-10 seconds too old or 5-10 seconds too new (depending on whether it uses the last reported length or waits until the next heartbeat to finalize the snapshot). A 5-10 second inconsistency window is plenty to break the situation described above: it's quite likely to get data-layer modifications 1 and 3 without getting namespace modifications 2 and 4, or vice versa.

On the other hand, the design I proposed above _does_ handle this, because the DN isn't reporting the length at the time of the heartbeat. Instead it's reporting a length which is causally consistent with the namespace from the perspective of the writer of that file.

bq. Todd, another option is to look at the inodesUnderConstruction in the NN and query the DNs for the exact filesize at the time of taking snapshot

We can't query the DNs while holding the NN lock. It could take several seconds or longer to contact all the DNs in a loaded 1000+ node cluster, and potentially tens of seconds if one of the nodes is actually down. So you'd have to drop the lock, at which point we're back to the above issue with consistency against concurrent NS modifications.

bq. A better approach is to have the application reach a quiesce point and then take a snap. This is normally done for Oracle (hot backup mode) and SQL Server so that an application-consistent snapshot can be taken.

The difference is that quiescing a single-node or small-cluster database like SQL Server or RAC is relatively easy. On the other hand, quiescing a 1000-node HBase cluster would take a while, and I don't think users will really tolerate a global stop-the-world to make a snapshot. This is especially true for use cases like DR/backup, where you expect to take snapshots as often as once every few minutes.
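The scale argument above can be made concrete with back-of-the-envelope arithmetic. The per-server flush rate ("several hundred") and the 3-second heartbeat interval are illustrative assumptions, not measured values:

```java
public class SnapshotRpcLoad {
    public static void main(String[] args) {
        int nodes = 1000;               // HBase cluster size from the comment
        int hflushPerSecPerNode = 300;  // "several hundred" flushes/sec/server (assumed)

        // Design A: every hflush triggers an immediate NameNode RPC.
        long perHflushRpcPerSec = (long) nodes * hflushPerSecPerNode;

        // Design B: each DN batches lengths into its periodic heartbeat
        // (3 seconds is the default DN heartbeat interval).
        double heartbeatIntervalSec = 3.0;
        double batchedRpcPerSec = nodes / heartbeatIntervalSec;

        System.out.println("per-hflush design: " + perHflushRpcPerSec + " RPC/s"); // 300000
        System.out.println("heartbeat design:  " + batchedRpcPerSec + " RPC/s");   // ~333
    }
}
```

Batching drops the NameNode load by roughly three orders of magnitude, which is exactly why the design trades per-flush reporting for the 5-10 second staleness window discussed above.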
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480772#comment-13480772 ]

Suresh Srinivas commented on HDFS-2802:
---

bq. In fact I think the design is based on my comment on HDFS-3960

Actually, we have been mulling over some of these ideas for a long time. HDFS-3960 was just started to get the discussion going. The design we are proposing is to let the DNs send the length; the last known length is what goes into the snapshot, instead of either recording zero length for a block under construction or having to initiate communication with the datanodes / implicitly getting it from the DN. From what I have heard from some HBase folks, a 5-10 second lag should be workable for them. That is why I want to talk to a few HBase folks in the design review.
[jira] [Comment Edited] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480772#comment-13480772 ]

Suresh Srinivas edited comment on HDFS-2802 at 10/20/12 4:34 PM:
---

bq. In fact I think the design is based on my comment on HDFS-3960

Actually, that is not true. We have been mulling over many of these ideas for a long time. HDFS-3960 was just created to get the discussion going. The design we are proposing is to let the DNs send the length; the last known length is what goes into the snapshot, instead of either recording zero length for a block under construction or having to initiate communication with the datanodes / implicitly getting it from the DN. From what I have heard from some HBase folks, a 5-10 second lag should be workable for them. That is why I want to talk to a few HBase folks in the design review.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480784#comment-13480784 ]

Hari Mankude commented on HDFS-2802:
---

Todd, I do not agree that your solution will be any more beneficial to HBase than what is being proposed. Any type of txid information in the DNs will be from the beginning of the transaction. If the client is writing in the middle of a block, there is no way to know the exact size when the snap was taken. Querying inodesUnderConstruction will give the block length at the time of the query. It is not possible to take an application-consistent snapshot (one which does not require recovery) without coordination with the application. In fact, communication with the DNs when snapshots are being taken will make the process of taking snapshots very slow while giving very little additional benefit.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480785#comment-13480785 ]

Hari Mankude commented on HDFS-2802:
---

Sorry, hit the comment early. Additionally, including the sizes of non-finalized blocks in snapshots has the implication that if the client dies and the non-finalized section is discarded, the snapshot might have pointers to non-existent blocks.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480803#comment-13480803 ]

Todd Lipcon commented on HDFS-2802:
---

bq. The design we are proposing is to let DNs send the length. The length known is what goes into the snapshot instead of recording either zero length for block under construction or having to initiate communication with datanodes/implicitly getting it from DN. From what I have heard from some HBase folks 5-10 seconds lagging should be workable for them. That is why I want to talk to a few HBase folks in the design review.

I hope I qualify as an HBase folk? 5-10 seconds lagging on the *data* is probably fine. But inconsistency between data-layer and namespace modifications is a lot tougher. Consider for example an application which uses a write-ahead log on HDFS to make a group of namespace modifications consistent. See HBASE-2231 for an example of a place where we currently have a dataloss bug for which the proposed fix is exactly this:
1. Write new files (compaction result)
2. Write to WAL that compaction is finished
3. Delete old files (compaction sources)

On recovery, if we see the "compaction finished" entry in the WAL, then we roll forward the transaction and delete the sources. But if the snapshot doesn't preserve the ordering of the above operations, we risk seeing "compaction finished" when the namespace doesn't have the new files, which would result in an accidental deletion of a bunch of data.

So I think we need a way to provide barriers between namespace and data-layer modifications. The proposal I made above should achieve this. Another option is something that we've called "super flush". This would be a flag on hflush() or hsync() indicating that the new length of the file needs to be persisted to the NameNode, not just the datanodes. It would be used by applications like HBase to determine consistency points for file lengths.

bq. In fact, communication with DNs when snapshots are being taken will make the process of taking snapshots very slow while giving very little additional benefit.

We should distinguish between two types of slowness for snapshots:
1) Slowness while holding a lock. This is unacceptable IMO - we must hold the lock for a bounded amount of time and never make an RPC while holding the lock.
2) Slowness before a snapshot is available for restore. This is acceptable.

For example, if the user operation "create snapshot" holds the lock for 10ms, but the snapshot is initially in a COLLECTING_LENGTHS state while it waits for block lengths, that seems acceptable. So long as the lengths are filled in by the next heartbeat (or two heartbeats from now), it should be complete (and thus ready for recovery) within the minute. Note that we don't need to wait for a heartbeat from every datanode. Instead, we just need to wait until, for each under-construction block in the snapshotted area, _one_ of its replicas has reported. When snapshotting a subtree without any open files, it would still be instant.

bq. Additionally, including the sizes of non-finalized blocks in snapshots has implication that if the client dies and the non-finalized section is discarded, then snapshot might have pointers to non-existent blocks.

I don't understand what you mean here... can you be more specific about the scenario?
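The ordering hazard in the compaction steps can be simulated with plain sets; the "snapshot" here is just a copied set of file names, a stand-in rather than an HDFS API:

```java
import java.util.HashSet;
import java.util.Set;

public class CompactionOrderingDemo {
    public static void main(String[] args) {
        // The snapshot's view of the namespace as a set of file names.
        // It captured the state before step 1 completed, so only the
        // compaction sources exist.
        Set<String> snapshotNamespace = new HashSet<>();
        snapshotNamespace.add("old-1"); // compaction source
        snapshotNamespace.add("old-2"); // compaction source

        // The inconsistent interleaving: step 2's WAL entry WAS captured,
        // but step 1's output file ("compacted") was NOT.
        boolean walSaysCompactionFinished = true;

        // HBase-style recovery against the snapshot: the WAL says the
        // compaction finished, so roll forward and delete the sources (step 3).
        if (walSaysCompactionFinished) {
            snapshotNamespace.remove("old-1");
            snapshotNamespace.remove("old-2");
        }

        // Sources deleted, compaction output never existed: data loss.
        System.out.println(snapshotNamespace.contains("compacted")); // false
        System.out.println(snapshotNamespace.isEmpty());             // true
    }
}
```

A snapshot that preserved the writer-observed ordering (or a "super flush" barrier) would guarantee that whenever the WAL entry is visible, the step-1 file is too, and the rolled-forward recovery would be safe.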
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480804#comment-13480804 ]

Suresh Srinivas commented on HDFS-2802:
---

bq. I hope I qualify as an HBase folk?

Sure. But I want to seek others' feedback before even considering adding any more complexity to the design.
[jira] [Commented] (HDFS-4079) Add SnapshotManager
[ https://issues.apache.org/jira/browse/HDFS-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480805#comment-13480805 ]

Suresh Srinivas commented on HDFS-4079:
---

Nicholas, I have two comments:
# SnapshotManager could be an interface. That way, if we need to, we can make it pluggable in the future. This we could do in a separate jira.
# For existing classes such as INode, where we are changing the access/visibility of methods, we should make the changes in trunk first. I am okay with making them in trunk and then merging into this branch later.

+1 for the patch.

Add SnapshotManager
---

Key: HDFS-4079
URL: https://issues.apache.org/jira/browse/HDFS-4079
Project: Hadoop HDFS
Issue Type: Sub-task
Components: name-node
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
Attachments: h4079_20121019.patch

SnapshotManager maintains a list of all the snapshottable directories in the namespace. It also supports snapshot-related methods such as setting a directory to snapshottable, creating a snapshot, etc.
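A hypothetical sketch of what the suggested interface seam might look like; the method names and the in-memory implementation are illustrative, not the patch's actual API:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical interface: the operations a pluggable snapshot manager exposes.
interface SnapshotManager {
    void setSnapshottable(String path);
    void createSnapshot(String path, String snapshotName);
    int getNumSnapshottableDirs();
}

// A trivial in-memory implementation, just to show the pluggability seam.
class SimpleSnapshotManager implements SnapshotManager {
    private final Set<String> snapshottable = new HashSet<>();
    private final List<String> snapshots = new ArrayList<>();

    @Override
    public void setSnapshottable(String path) {
        snapshottable.add(path);
    }

    @Override
    public void createSnapshot(String path, String snapshotName) {
        if (!snapshottable.contains(path)) {
            throw new IllegalArgumentException(path + " is not snapshottable");
        }
        // Snapshot names are unique per snapshot root, per the design.
        snapshots.add(path + "@" + snapshotName);
    }

    @Override
    public int getNumSnapshottableDirs() {
        return snapshottable.size();
    }
}

public class SnapshotManagerDemo {
    public static void main(String[] args) {
        SnapshotManager m = new SimpleSnapshotManager();
        m.setSnapshottable("/user/hbase");
        m.createSnapshot("/user/hbase", "s1");
        System.out.println(m.getNumSnapshottableDirs()); // 1
    }
}
```

Coding FSNamesystem against the interface rather than the concrete class is what would let an alternate implementation be swapped in later via a separate jira.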
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480808#comment-13480808 ]

Suresh Srinivas commented on HDFS-2802:
---

Should we consider moving the consistency part of the discussion to HDFS-3960?
[jira] [Comment Edited] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480547#comment-13480547 ] Suresh Srinivas edited comment on HDFS-2802 at 10/20/12 7:20 PM: - Thanks for the comments, guys. bq. In some of the most commercially popular systems which implement snapshots, snapshots do not count against the disk quotas How do they handle disk quota use when the original file is deleted and only snapshots exist? That is the reason why counting the disk quota makes sense. bq. First, I'm concerned with the O(# of files + # of directories) nature of this design, both in terms of time taken to create a snapshot and the NN memory resources consumed. I agree with you on this. We wanted to begin with this approach and then optimize memory use further. The initial patch uploaded here attempted premature optimization of both memory and snapshot creation time, and thus made the code really complicated. But this is a definite goal, and we will update that part of the design as we continue to work. This is covered in the open issues/future work section. comment 1: Agree with this part. As we continue the work, we can make a decision on this. For supporting RW, let's not make the design/implementation more complicated. comment 2: Will address this as we continue to add more details to the design in the next update. Comment 3, 6: I want to make sure you understand this is an early design and we will continue to add more details. I think some of the questions will be answered by how this works: - Admin can mark directories as snapshottable using the CLI - Users can then create snapshots for these directories using the CLI/API. A snapshot has a snapshot name, and it is unique for a given snapshot root. comment 4: If you look at the snapshot implementation in other systems, it is done at the volume level. That is the parallel we are talking about. 
Comment 5, Comment 7, comment 10: As regards consistency (comment 7), a system where the snapshot is taken at the namespace level without involving the data layer cannot provide a strict consistency guarantee. I also think it may not be relevant where the writers are different from the client that is taking the snapshot. Not sure what guarantee such a client can expect/depend on given that the writers are separate. We could discuss this during the design review. Based on discussion with a few HBase folks, I also think they should be okay with it. Something to discuss with them. I am also not clear on their dependency on HDFS with hbase-6055. comment 8: This could change during implementation if we think access time may not be that important to maintain. comment 9: Agreed. I am leaning towards allowing it. comment 11: Will add use cases. comment 12: See the volume comment; the document sort of covers this. We could discuss this further if the document is not clear.
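The admin/user workflow described in the comment above can be illustrated with the snapshot commands that eventually shipped in HDFS (shown purely as a sketch; the exact command syntax postdates this discussion, and the paths and snapshot name are made up):

```shell
# Admin marks a directory as snapshottable (admin-only operation).
hdfs dfsadmin -allowSnapshot /user/data

# A user then creates a named snapshot; the name must be unique
# under the given snapshot root.
hdfs dfs -createSnapshot /user/data backup-20121020

# The snapshot is exposed under a read-only .snapshot subdirectory.
hdfs dfs -ls /user/data/.snapshot/backup-20121020
```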
[jira] [Created] (HDFS-4094) Specific file type bulk Transfer into HDFS to a specified HDFS location with a track of the transfer number
Anurag G Vyas created HDFS-4094: --- Summary: Specific file type bulk Transfer into HDFS to a specified HDFS location with a track of the transfer number Key: HDFS-4094 URL: https://issues.apache.org/jira/browse/HDFS-4094 Project: Hadoop HDFS Issue Type: Wish Components: scripts Affects Versions: 3.0.0 Environment: Unix Reporter: Anurag G Vyas Fix For: 3.0.0 Need a script for a bulk transfer into HDFS such that the script can identify a specific file type and move only that file type into the specified HDFS location. For example: say I have a local directory called user/dir containing txt files such as anurag123, anurag234, vyas678, ganesh345, anurag277, vyas345, ganesh789, etc. The script must take 2 inputs: one is the file type, say anurag, and the other an HDFS location. It must then move all the files in the directory whose names contain anurag to the specified HDFS location. It would also be good if the script could keep track of the number of files it moved to HDFS by saving the details in a new file, either in HDFS or in the local directory.
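Such a convenience script could be sketched as follows; the function name `bulk_put` and the `transfer_count.txt` file name are illustrative choices, and the sketch assumes the standard `hdfs dfs -put` command is on the PATH:

```shell
#!/usr/bin/env bash
# bulk_put PATTERN SRC_DIR HDFS_DEST
# Copies every regular file in SRC_DIR whose name contains PATTERN
# into HDFS_DEST, and records how many files were transferred.
bulk_put() {
  local pattern="$1" src_dir="$2" dest="$3" count=0
  for f in "$src_dir"/*"$pattern"*; do
    [ -f "$f" ] || continue          # skip directories and non-matches
    hdfs dfs -put "$f" "$dest" && count=$((count + 1))
  done
  # Save the transfer count locally; it could also be -put into HDFS.
  echo "$count" > "$src_dir/transfer_count.txt"
  echo "moved $count file(s) matching '$pattern' to $dest"
}
```

For the example directory above, `bulk_put anurag /home/user/dir /user/hdfs/dest` would copy anurag123, anurag234, and anurag277 and record the count 3 in transfer_count.txt.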
[jira] [Resolved] (HDFS-4094) Specific file type bulk Transfer into HDFS to a specified HDFS location with a track of the transfer number
[ https://issues.apache.org/jira/browse/HDFS-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-4094. --- Resolution: Invalid This sounds like a request for a higher-order convenience script, not a feature we'd put in Hadoop proper.
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480836#comment-13480836 ] Colin Patrick McCabe commented on HDFS-2802: Suresh said: bq. How do [other filesystems] handle disk quota use when the original file is deleted and only snapshots exist? That is the reason why counting the disk quota makes sense. ZFS has quotas and refquotas. The former includes snapshot overhead; the latter does not. Based on some Googling, I think that on NetApp devices quotas do not include snapshot overhead (at least by default). I think it makes sense to offer both kinds of quota, although we don't have to implement them both right away, of course.
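The ZFS distinction mentioned in the comment above maps onto two dataset properties; a brief illustration (the property names are real ZFS, but the dataset name `tank/home` and the 10G limit are made up):

```shell
# 'quota' caps all space consumed under the dataset,
# including space held by snapshots.
zfs set quota=10G tank/home

# 'refquota' caps only the space the dataset itself references,
# so snapshot overhead does not count against it.
zfs set refquota=10G tank/home

# Inspect both limits.
zfs get quota,refquota tank/home
```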
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13480840#comment-13480840 ] Suresh Srinivas commented on HDFS-2802: --- bq. ZFS has quotas and refquotas. The former includes snapshot overhead; the latter does not. Good to know. Something we should consider as well.
[jira] [Resolved] (HDFS-4057) NameNode.namesystem should be private. Use getNamesystem() instead.
[ https://issues.apache.org/jira/browse/HDFS-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas resolved HDFS-4057. --- Resolution: Fixed Fix Version/s: 1.2.0 I committed the patch to branch-1. Thank you Brandon. NameNode.namesystem should be private. Use getNamesystem() instead. --- Key: HDFS-4057 URL: https://issues.apache.org/jira/browse/HDFS-4057 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 1.2.0 Reporter: Brandon Li Assignee: Brandon Li Priority: Minor Fix For: 1.2.0 Attachments: HDFS-4057.branch-1.patch, HDFS-4057.branch-1.patch NameNode.namesystem should be private. One should use NameNode.getNamesystem() to get it instead.
[jira] [Updated] (HDFS-4072) On file deletion remove corresponding blocks pending replication
[ https://issues.apache.org/jira/browse/HDFS-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-4072: -- Fix Version/s: 1.2.0 I committed the patch to branch-1. On file deletion remove corresponding blocks pending replication Key: HDFS-4072 URL: https://issues.apache.org/jira/browse/HDFS-4072 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 3.0.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 1.2.0, 3.0.0, 2.0.3-alpha Attachments: HDFS-4072.b1.001.patch, HDFS-4072.patch, HDFS-4072.trunk.001.patch, HDFS-4072.trunk.002.patch, HDFS-4072.trunk.003.patch, HDFS-4072.trunk.004.patch, TestPendingAndDelete.java Currently, when deleting a file, blockManager does not remove the records corresponding to the file's blocks from pendingReplications. These records can only be removed after a timeout (5~10 min).
[jira] [Created] (HDFS-4095) Add snapshot related metrics
Jing Zhao created HDFS-4095: --- Summary: Add snapshot related metrics Key: HDFS-4095 URL: https://issues.apache.org/jira/browse/HDFS-4095 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Add metrics for the number of snapshots in the system, including 1) the number of snapshot files, and 2) the number of snapshot-only files (snapshot files that are not deleted even though the original file has already been deleted).
[jira] [Created] (HDFS-4096) Add snapshot information to namenode WebUI
Jing Zhao created HDFS-4096: --- Summary: Add snapshot information to namenode WebUI Key: HDFS-4096 URL: https://issues.apache.org/jira/browse/HDFS-4096 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Add snapshot information to namenode WebUI.
[jira] [Created] (HDFS-4097) provide CLI support for create/delete/list snapshots
Brandon Li created HDFS-4097: Summary: provide CLI support for create/delete/list snapshots Key: HDFS-4097 URL: https://issues.apache.org/jira/browse/HDFS-4097 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs client, name-node Affects Versions: Snapshot (HDFS-2802) Reporter: Brandon Li Assignee: Brandon Li provide CLI support for create/delete/list snapshots