[ 
https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271616#comment-13271616
 ] 

Hari Mankude commented on HDFS-2802:
------------------------------------

@Eli,
Regarding scenario #3, consider an HBase setup with a huge dataset in production. 
A new app has been developed that needs to be validated against the production 
dataset. It is not feasible to copy the entire dataset to a test setup. At the 
same time, the app is not ready for production, and it is not safe to let it 
modify the data in the production database. One solution for this type of 
problem is to take a RW snapshot of the production dataset and then run the 
development app against the RW snapshot. After the app testing is done, the RW 
snapshot is deleted. This assumes that the cluster has sufficient compute 
capacity and incremental storage capacity to support RW snapshots.

Regarding appends, the current snapshot prototype relies on the file size that 
is available at the namenode. So, if a file is appended to after a snapshot is 
taken, the append is a no-op from the snapshot's perspective. If a snapshot is 
taken of a file that has an append pipeline set up, the inode is of type 
under-construction in the NN, and the prototype still relies on the file size 
that is available at the NN. This might not be perfect, and I have some ideas 
on how to acquire a more up-to-date file size.
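
To make the append behavior concrete, here is a minimal toy model in Java. It 
is only a sketch under stated assumptions, not the prototype's code and not an 
HDFS API: the class names (ToyNamespace, ToySnapshot, SnapshotLengthSketch) 
and the path used are hypothetical. It assumes a snapshot simply records the 
file length known to the namenode at the moment it is taken, and it does not 
model the case where the NN's length lags behind an open append pipeline.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for the namenode's view of file lengths (hypothetical). */
class ToyNamespace {
    // Path -> file length as currently known to the toy namenode.
    private final Map<String, Long> lengths = new HashMap<>();

    void create(String path, long length) {
        lengths.put(path, length);
    }

    // An append only grows the live file's length.
    void append(String path, long extraBytes) {
        Long current = lengths.get(path);
        if (current == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        lengths.put(path, current + extraBytes);
    }

    void delete(String path) {
        lengths.remove(path);
    }

    // Returns null if the file does not exist in the live namespace.
    Long liveLength(String path) {
        return lengths.get(path);
    }

    // Taking a snapshot freezes a copy of the lengths known at that instant.
    ToySnapshot snapshot() {
        return new ToySnapshot(new HashMap<>(lengths));
    }
}

/** Immutable view of the lengths captured when the snapshot was taken. */
class ToySnapshot {
    private final Map<String, Long> frozen;

    ToySnapshot(Map<String, Long> frozen) {
        this.frozen = frozen;
    }

    // Returns null if the file did not exist when the snapshot was taken.
    Long length(String path) {
        return frozen.get(path);
    }
}

class SnapshotLengthSketch {
    public static void main(String[] args) {
        ToyNamespace ns = new ToyNamespace();
        ns.create("/hbase/table1", 100L);

        ToySnapshot snap = ns.snapshot();
        ns.append("/hbase/table1", 50L);   // append after the snapshot
        ns.delete("/hbase/table1");        // delete after the snapshot

        System.out.println("live length: " + ns.liveLength("/hbase/table1"));
        System.out.println("snapshot length: " + snap.length("/hbase/table1"));
    }
}
```

In this model, appends and deletes made after the snapshot change only the 
live namespace; the snapshot keeps reporting the length it captured.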

I thought that truncate is not currently supported in trunk. If you are 
referring to deletes, the prototype handles deletes correctly.

I will post a more detailed doc after I am done with the HA-related work.
                
> Support for RW/RO snapshots in HDFS
> -----------------------------------
>
>                 Key: HDFS-2802
>                 URL: https://issues.apache.org/jira/browse/HDFS-2802
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node, name-node
>    Affects Versions: 0.24.0
>            Reporter: Hari Mankude
>            Assignee: Hari Mankude
>         Attachments: snapshot-one-pager.pdf
>
>
> Snapshots are point-in-time images of parts of the filesystem or the entire 
> filesystem. Snapshots can be a read-only or a read-write point-in-time copy 
> of the filesystem. There are several use cases for snapshots in HDFS. I will 
> post a detailed write-up soon with more information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
