[ 
https://issues.apache.org/jira/browse/HDFS-12191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139264#comment-16139264
 ] 

Yongjun Zhang commented on HDFS-12191:
--------------------------------------

Thanks a lot [~manojg] for the detailed review. I'm uploading a new rev 002 to 
address your comments.

* 1.1 changed to "dfs.namenode.snapshot.skip.accesstime-only-change"
* 1.2 exceeding 80 chars is consistent in this file, so kept as is. Hard to 
avoid because of the long string
* 2. reverted. This was left over from adding/removing debug messages.
* 3. addressed
* 4. This seems to me the best place for the change. Otherwise we would need to 
change the interface to communicate the value here somehow. If you have 
thoughts, I would be happy to hear them.
* 5. added a test to verify that access time is captured when there is other 
modification.

Would you please take a look again?

Thanks.




> Provide option to not capture the accessTime change of a file to snapshot if 
> no other modification has been done
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12191
>                 URL: https://issues.apache.org/jira/browse/HDFS-12191
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 3.0.0-beta1
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12191.001.patch, HDFS-12191.002.patch
>
>
> Currently, if the accessTime of a file changes before a snapshot is taken, 
> this accessTime will be captured in the snapshot, even if no other 
> modification has been made to this file.
> Because of this, when we calculate snapshotDiff, more work needs to be done 
> for this file, e.g., the metadataEquals method will be called, even though no 
> modification was made (and thus nothing is recorded in snapshotDiff). This 
> can cause snapshotDiff to slow down quite a lot when there are a lot of files 
> to be examined.
> This jira is to provide an option to skip capturing accessTime-only changes 
> in the snapshot, so that snapshotDiff can be done faster.
> When the accessTime of a file changes, if there is another modification to 
> the file, the access time will still be captured in the snapshot.
> Sometimes we want the accessTime to be captured in the snapshot, so that when 
> restoring from the snapshot, we know the accessTime of this snapshot. So this 
> new feature is optional, and is controlled by a config property.
> It is worth mentioning that how accurately the accessTime is captured depends 
> on the following config, which has a default value of 1 hour, meaning that a 
> new access within an hour of the previous access will not be captured.
> {code}
> public static final String DFS_NAMENODE_ACCESSTIME_PRECISION_KEY =
>     HdfsClientConfigKeys.DeprecatedKeys.DFS_NAMENODE_ACCESSTIME_PRECISION_KEY;
> public static final long DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT = 3600000;
> {code}
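
For illustration only, the behavior described in the issue above can be sketched 
in a few lines of self-contained Java. This is not the code from the attached 
patches: the class and method names are hypothetical, and the boolean 
skipAccessTimeOnlyChange merely stands in for the proposed 
"dfs.namenode.snapshot.skip.accesstime-only-change" option. The sketch assumes 
that an access is recorded only when it is more than the precision interval 
after the previously stored access time, as the description states.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; none of these names are taken from the HDFS code base. */
public class AccessTimeSnapshotSketch {
    // Same value as the 1-hour default quoted in the issue (3600000 ms).
    static final long ACCESSTIME_PRECISION_MS = TimeUnit.HOURS.toMillis(1);

    /**
     * Assumption: a new access updates the stored access time only if it is
     * more than precisionMs later than the stored value.
     */
    static boolean accessTimeUpdated(long storedAtimeMs, long newAtimeMs,
                                     long precisionMs) {
        return newAtimeMs > storedAtimeMs + precisionMs;
    }

    /**
     * Decides whether a file's change should be captured in the snapshot.
     * Any other modification is always captured (the access time goes with it);
     * an access-time-only change is captured unless the option says to skip it.
     */
    static boolean captureInSnapshot(boolean atimeUpdated,
                                     boolean otherModification,
                                     boolean skipAccessTimeOnlyChange) {
        if (otherModification) {
            return true;
        }
        return atimeUpdated && !skipAccessTimeOnlyChange;
    }

    public static void main(String[] args) {
        long stored = 0L;
        long halfHourLater = stored + TimeUnit.MINUTES.toMillis(30);
        long twoHoursLater = stored + TimeUnit.HOURS.toMillis(2);

        boolean early = accessTimeUpdated(stored, halfHourLater, ACCESSTIME_PRECISION_MS);
        boolean late = accessTimeUpdated(stored, twoHoursLater, ACCESSTIME_PRECISION_MS);

        System.out.println("access after 30 min recorded: " + early);
        System.out.println("access after 2 h recorded: " + late);
        System.out.println("atime-only change captured, option off: "
            + captureInSnapshot(late, false, false));
        System.out.println("atime-only change captured, option on: "
            + captureInSnapshot(late, false, true));
        System.out.println("atime plus other change captured, option on: "
            + captureInSnapshot(late, true, true));
    }
}
```

With the option enabled, a file whose only change is its access time 
contributes nothing to the snapshot in this sketch, which is the case that 
lets snapshotDiff skip the extra comparison work.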



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

