[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271616#comment-13271616 ]
Hari Mankude commented on HDFS-2802:
------------------------------------

@Eli,

Regarding scenario #3, consider an HBase setup with a huge dataset in production. A new app has been developed which needs to be validated against the production dataset. It is not feasible to copy the entire dataset to a test setup. At the same time, the app is not ready for production, and it is not safe to let it modify the data in the production database. One solution for this type of problem is to take a RW snapshot of the production dataset and run the development app against the RW snapshot. After the app testing is done, the RW snapshot is deleted. This assumes that the cluster has sufficient compute capacity and incremental storage capacity to support RW snapshots.

Regarding appends, the current snapshot prototype relies on the file size that is available at the namenode. So, if a file is appended to after the snapshot is taken, the append is a no-op from the snapshot's perspective. If a snapshot is taken of a file that has an append pipeline set up, the inode is of type under-construction in the NN, and the prototype still relies on the file size that is available on the NN. This might not be perfect, and I have some ideas for acquiring a more up-to-date file size.

I thought that truncate is not currently supported in trunk. If you are referring to deletes, the prototype handles deletes correctly without issues.

I will post a more detailed doc after I am done with the HA-related work.

> Support for RW/RO snapshots in HDFS
> -----------------------------------
>
>                 Key: HDFS-2802
>                 URL: https://issues.apache.org/jira/browse/HDFS-2802
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.24.0
>            Reporter: Hari Mankude
>            Assignee: Hari Mankude
>         Attachments: snapshot-one-pager.pdf
>
>
> Snapshots are point in time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point in time copy of the filesystem.
> There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information.
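
To illustrate the append behavior described in the comment above, here is a minimal Java sketch of a snapshot that records only the file length known to the namenode when the snapshot is taken. It is not the prototype's code: the class SnapshotLengthSketch and all of its methods are hypothetical, the namespace is modeled as a plain in-memory map, and the handling of deletes is an assumption about what "handles deletes correctly" means (the snapshot keeps its own entry).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of length-based snapshots; not HDFS code. */
class SnapshotLengthSketch {
    // Live namespace: path -> file length currently recorded at the "namenode".
    private final Map<String, Long> liveLengths = new HashMap<>();
    // Snapshot view: path -> file length recorded when the snapshot was taken.
    private Map<String, Long> snapshotLengths = new HashMap<>();

    /** Creates (or overwrites) a file with the given length. */
    void create(String path, long length) {
        liveLengths.put(path, length);
    }

    /** Appends bytes to a live file; only the live length changes. */
    void append(String path, long bytes) {
        liveLengths.merge(path, bytes, Long::sum);
    }

    /** Deletes a live file; a snapshot taken earlier keeps its own entry (assumption). */
    void delete(String path) {
        liveLengths.remove(path);
    }

    /** Copies the lengths known right now; later live changes do not alter this copy. */
    void takeSnapshot() {
        snapshotLengths = new HashMap<>(liveLengths);
    }

    /** Returns the live length, or null if the file does not exist. */
    Long liveLength(String path) {
        return liveLengths.get(path);
    }

    /** Returns the length as of the snapshot, or null if the file was not in it. */
    Long snapshotLength(String path) {
        return snapshotLengths.get(path);
    }
}
```

In this model, creating a 100-byte file, taking a snapshot, and then appending 50 bytes leaves the snapshot length at 100 while the live length becomes 150, which is the "no-op from the snapshot's perspective" behavior. A file still under construction would be captured with whatever length the namenode holds at that moment, which may lag behind the data already written to the pipeline.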