[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903712#comment-15903712 ]
Manoj Govindassamy commented on HDFS-11402:
-------------------------------------------

[~yzhangal], any review comments on the patch? Please let me know your thoughts.

> HDFS Snapshots should capture point-in-time copies of OPEN files
> ----------------------------------------------------------------
>
>                 Key: HDFS-11402
>                 URL: https://issues.apache.org/jira/browse/HDFS-11402
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 2.6.0
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-11402.01.patch, HDFS-11402.02.patch
>
> *Problem:*
> 1. When files are being written and HDFS Snapshots are taken in parallel, the Snapshots do capture all these files, but the snapshotted copies of the files still being written do not have the point-in-time file length captured. That is, these open files are not frozen in the HDFS Snapshots: they grow or shrink in length, just like the original file, even after the snapshot time.
> 2. At file close, or on any other metadata modification operation on these files, HDFS reconciles the file length and records the modification in the last taken Snapshot. All the previously taken Snapshots continue to hold those open files with no modification recorded, so they all end up resolving to the final modification record in the last snapshot. Thus, after the file close, the file lengths in all those snapshots end up the same.
> Assume File1 is opened for write and a total of 1MB is written to it. While the writes are happening, snapshots are taken in parallel.
> {noformat}
> |---Time---T1-----------T2-------------T3----------------T4------>
> |-----------------------Snap1----------Snap2-------------Snap3--->
> |---File1.open---write---------write-----------close------------->
> {noformat}
> Then at time,
> T2:
>   Snap1.File1.length = 0
> T3:
>   Snap1.File1.length = 0
>   Snap2.File1.length = 0
> <File1 write completed and closed>
> T4:
>   Snap1.File1.length = 1MB
>   Snap2.File1.length = 1MB
>   Snap3.File1.length = 1MB
> *Proposal*
> 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze open files.
> 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the {{LeaseManager}} and get a list of {{INodesInPath}} for all open files under the snapshot dir.
> 3. After the Snapshot creation, Diff creation, and modification-time update, {{DirectorySnapshottableFeature#addSnapshot}} can invoke {{INodeFile#recordModification}} for each of the open files. This way, the Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for each of the open files.
> 4. The above model follows the current Snapshot and Diff protocols and doesn't introduce any new disk formats. So, I don't think we will need any new FSImage Loader/Saver changes for Snapshots.
> 5. One of the design goals of HDFS Snapshots was the ability to take any number of snapshots in O(1) time. Though the {{LeaseManager}} holds all the open files with leases in an in-memory map, an iteration is still needed to prune down to the needed open files and then run {{recordModification}} on each of them. So, it will not be strictly O(1) with the above proposal. But it is only a marginal increase, as the new order will be O(open_files_under_snap_dir).
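The snapshot-length collapse in the timeline above can be illustrated with a self-contained toy model. This is plain Java, not actual HDFS code; the class and method names are illustrative only. The key behavior it models: a lookup for a snapshot with no recorded {{FileDiff}} falls through to the next recorded diff, and after close the only diff is the one holding the final length.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not HDFS code) of snapshot length resolution for an open file.
// A snapshot's view of the file is the earliest FileDiff recorded at or after
// that snapshot; with no diff at all, the lookup falls through to the live file.
class SnapshotLengthModel {
    // One recorded diff: the snapshot it was recorded against and the length then.
    record FileDiff(int snapshotId, long fileSize) {}

    private final List<FileDiff> diffs = new ArrayList<>();
    private long currentLength = 0;
    private int latestSnapshotId = -1;

    void takeSnapshot(int id) { latestSnapshotId = id; }   // no diff for open files

    void write(long bytes) { currentLength += bytes; }     // open file keeps growing

    // Close reconciles the length and records it against the *last* snapshot only.
    void close() {
        if (latestSnapshotId >= 0) {
            diffs.add(new FileDiff(latestSnapshotId, currentLength));
        }
    }

    // Length of the file as seen in snapshot `id`.
    long lengthInSnapshot(int id) {
        for (FileDiff d : diffs) {
            if (d.snapshotId() >= id) {
                return d.fileSize();
            }
        }
        return currentLength;   // falls through to the live file
    }
}
```

Running the T1..T4 timeline through this model, Snap1, Snap2, and Snap3 all report 1MB after the close, matching the problem statement.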
> In order to avoid a change in HDFS Snapshot behavior for open files and a change in time complexity, this improvement can be made under a new config {{"dfs.namenode.snapshot.freeze.openfiles"}}, which by default can be {{false}}.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
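The proposal's freeze step can be sketched with the same kind of toy model (again plain Java, not HDFS code; only the config key {{dfs.namenode.snapshot.freeze.openfiles}} comes from the proposal, everything else is illustrative). With the flag on, the snapshot step records one diff per lease-held open file, capturing its point-in-time length at a cost of O(open files under the snapshot dir):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy sketch (not HDFS code) of the proposed freeze step. With the flag on,
// addSnapshot records a FileDiff for every lease-held open file, so each
// snapshot keeps the point-in-time length; cost is O(open files), not O(1).
class FreezingSnapshotSketch {
    record FileDiff(int snapshotId, long fileSize) {}

    static class OpenFile {
        final List<FileDiff> diffs = new ArrayList<>();
        long currentLength = 0;

        long lengthInSnapshot(int id) {
            for (FileDiff d : diffs) {
                if (d.snapshotId() >= id) {
                    return d.fileSize();
                }
            }
            return currentLength;
        }
    }

    private final boolean freezeOpenFiles;                  // proposed config, default false
    private final Map<String, OpenFile> leases = new LinkedHashMap<>(); // stand-in for LeaseManager

    FreezingSnapshotSketch(boolean freezeOpenFiles) {
        this.freezeOpenFiles = freezeOpenFiles;
    }

    OpenFile open(String path) {
        OpenFile f = new OpenFile();
        leases.put(path, f);
        return f;
    }

    // Stand-in for DirectorySnapshottableFeature#addSnapshot with the freeze step.
    void addSnapshot(int id) {
        if (freezeOpenFiles) {
            // One recordModification per open file under the snapshot dir.
            for (OpenFile f : leases.values()) {
                f.diffs.add(new FileDiff(id, f.currentLength));
            }
        }
    }
}
```

With the flag off, `addSnapshot` records nothing and the model degenerates to today's behavior, which is why the default of {{false}} preserves both semantics and O(1) snapshot time.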