[ https://issues.apache.org/jira/browse/HDFS-12191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140691#comment-16140691 ]
Manoj Govindassamy commented on HDFS-12191:
-------------------------------------------

Thanks for working on the patch revision [~yzhangal].

1. {{INode}}
{noformat}
public abstract class INode {
  ..
  private static boolean dontCaptureAccessTimeOnlyChangeInSnapshot = false;

  public static void setDontCaptureAccessTimeOnlyChangeInSnapshot(boolean s) {
    LOG.info("Setting dontCaptureAccessTimeOnlyChangeInSnapshot to " + s);
    dontCaptureAccessTimeOnlyChangeInSnapshot = s;
  }

  public static boolean getDontCaptureAccessTimeOnlyChangeInSnapshot() {
    return dontCaptureAccessTimeOnlyChangeInSnapshot;
  }
{noformat}
* The abstract class INode doesn't look like the right place for the snapshot logic. The callers of {{INode#setAccessTime()}} can pass in the details needed to skip recording the modification.
* FSNamesystem#setAccessTimes() has all the details needed to decide whether or not to record the access time change in the snapshots. So, shall we pass in the details from there?

> Provide option to not capture the accessTime change of a file to snapshot if
> no other modification has been done
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-12191
>                 URL: https://issues.apache.org/jira/browse/HDFS-12191
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 3.0.0-beta1
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12191.001.patch, HDFS-12191.002.patch
>
>
> Currently, if the accessTime of a file changed before a snapshot is taken,
> this accessTime will be captured in the snapshot, even if no other
> modification has been made to the file.
> Because of this, when we calculate snapshotDiff, more work needs to be done
> for this file, e.g., the metadataEquals method will be called, even if no
> modification has been made (and thus nothing is recorded in snapshotDiff).
> This can cause snapshotDiff to slow down quite a lot when there are a lot of
> files to be examined.
> This jira is to provide an option to skip capturing an accessTime-only change
> in the snapshot, so that snapshotDiff can be done faster.
> When the accessTime of a file changes and there is another modification to
> the file, the access time will still be captured in the snapshot.
> Sometimes we want the accessTime to be captured in the snapshot, so that when
> restoring from the snapshot we know the accessTime of this snapshot. So this
> new feature is optional, and is controlled by a config property.
> It is worth mentioning that how accurately the accessTime is captured depends
> on the following config, which has a default value of 1 hour. This means a
> new access within an hour of the previous access will not be captured.
> {code}
> public static final String DFS_NAMENODE_ACCESSTIME_PRECISION_KEY =
>     HdfsClientConfigKeys.DeprecatedKeys.DFS_NAMENODE_ACCESSTIME_PRECISION_KEY;
> public static final long DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT = 3600000;
> {code}
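
To make the two behaviours discussed in this thread concrete, here is a minimal sketch in Java. It is not the HDFS implementation. {{AccessTimeSketch}}, {{FileNode}}, {{isBeyondPrecision}} and the {{skipAccessTimeOnlyCapture}} parameter are hypothetical names introduced for illustration; only the 3600000 ms default comes from the issue text. The sketch assumes an access time update is dropped when it falls within the precision window of the previous access, and it shows the caller passing the record-or-skip decision down to the node instead of the node reading a static flag.

```java
public class AccessTimeSketch {
    // Same value as DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT in the issue text: 1 hour in ms.
    static final long ACCESS_TIME_PRECISION_MS = 3_600_000L;

    /** Hypothetical stand-in for an inode; this is not the HDFS INode class. */
    static final class FileNode {
        long accessTime;
        int snapshotRecords; // how many times a change was recorded for the snapshot

        FileNode(long accessTime) {
            this.accessTime = accessTime;
        }

        // The caller decides whether the change is recorded; no static flag is consulted.
        void setAccessTime(long newTime, boolean recordInSnapshot) {
            if (recordInSnapshot) {
                snapshotRecords++;
            }
            accessTime = newTime;
        }
    }

    /** True when 'current' is more than 'precisionMs' after 'previous'. */
    static boolean isBeyondPrecision(long previous, long current, long precisionMs) {
        return current > previous + precisionMs;
    }

    /**
     * Hypothetical caller-side logic: drop updates inside the precision window,
     * otherwise update the node and tell it whether to record the change.
     * Returns true when the access time was updated.
     */
    static boolean setAccessTime(FileNode node, long now, boolean skipAccessTimeOnlyCapture) {
        if (!isBeyondPrecision(node.accessTime, now, ACCESS_TIME_PRECISION_MS)) {
            return false; // too close to the previous access, nothing changes
        }
        node.setAccessTime(now, !skipAccessTimeOnlyCapture);
        return true;
    }
}
```

The sketch deliberately leaves out the case from the description where another modification accompanies the access time change; in that case the access time would still be captured regardless of the option.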