[ 
https://issues.apache.org/jira/browse/HDFS-12191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140691#comment-16140691
 ] 

Manoj Govindassamy commented on HDFS-12191:
-------------------------------------------

Thanks for working on the patch revision [~yzhangal]. 

1. {{INode}}
{noformat}
public abstract class INode {
.. 
   private static boolean dontCaptureAccessTimeOnlyChangeInSnapshot = false;
   
   public static void setDontCaptureAccessTimeOnlyChangeInSnapshot(boolean s) {
       LOG.info("Setting dontCaptureAccessTimeOnlyChangeInSnapshot to " + s);
       dontCaptureAccessTimeOnlyChangeInSnapshot = s;
   }

   public static boolean getDontCaptureAccessTimeOnlyChangeInSnapshot() {
       return dontCaptureAccessTimeOnlyChangeInSnapshot;
   }
{noformat}
* The abstract class {{INode}} doesn't look like the right place for the snapshot 
logic. The callers of {{INode#setAccessTime()}} can pass in the details needed 
to skip recording the modification.
* {{FSNamesystem#setAccessTimes()}} has all the details needed to decide 
whether to record access time changes in the snapshots. So, shall we 
pass in the details from there?
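
To illustrate the suggestion, here is a minimal sketch of the caller-decides approach. It is not the actual HDFS code: {{SketchInode}}, {{SketchNamesystem}}, {{updateAccessTime}} and the {{skipCaptureInSnapshot}} parameter are all hypothetical names, and the "snapshot" is simulated with a plain list. The point is only that the configuration lives in the namesystem layer and is passed down per call, so the inode class needs no static flag.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an inode; NOT the real HDFS INode class. */
class SketchInode {
    private long accessTime;
    /** Pre-change access times saved to the (simulated) latest snapshot. */
    private final List<Long> snapshotRecords = new ArrayList<>();

    /**
     * The caller decides whether the old value is saved to the snapshot.
     * The inode itself holds no static configuration.
     */
    void setAccessTime(long newTime, boolean skipCaptureInSnapshot) {
        if (!skipCaptureInSnapshot) {
            snapshotRecords.add(accessTime); // record the pre-change value
        }
        accessTime = newTime;
    }

    long getAccessTime() {
        return accessTime;
    }

    int snapshotRecordCount() {
        return snapshotRecords.size();
    }
}

/** Hypothetical stand-in for the namesystem layer that owns the config. */
class SketchNamesystem {
    /** Assumed to be read once from a config property at startup. */
    private final boolean skipAccessTimeOnlyChange;

    SketchNamesystem(boolean skipAccessTimeOnlyChange) {
        this.skipAccessTimeOnlyChange = skipAccessTimeOnlyChange;
    }

    /** Passes the decision down instead of setting a static flag on the inode. */
    void updateAccessTime(SketchInode inode, long atime) {
        inode.setAccessTime(atime, skipAccessTimeOnlyChange);
    }
}
```

With this shape, tests can exercise both behaviours side by side by constructing two namesystem instances, which is harder with a process-wide static flag.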


> Provide option to not capture the accessTime change of a file to snapshot if 
> no other modification has been done
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12191
>                 URL: https://issues.apache.org/jira/browse/HDFS-12191
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 3.0.0-beta1
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12191.001.patch, HDFS-12191.002.patch
>
>
> Currently, if the accessTime of a file changes before a snapshot is taken, 
> this accessTime will be captured in the snapshot, even if no other 
> modifications have been made to the file.
> Because of this, when we calculate snapshotDiff, more work needs to be done 
> for this file, e.g., the metadataEquals method will be called, even if no 
> modification has been made (and thus nothing is recorded in snapshotDiff). 
> This can cause snapshotDiff to slow down considerably when there are a lot of 
> files to be examined.
> This jira is to provide an option to skip capturing accessTime-only changes in 
> the snapshot, so that snapshotDiff can be done faster.
> When the accessTime of a file changes, if there is another modification to the 
> file, the access time will still be captured in the snapshot.
> Sometimes we want the accessTime to be captured in the snapshot, so that when 
> restoring from the snapshot, we know the accessTime of this snapshot. So this 
> new feature is optional, and is controlled by a config property.
> It is worth mentioning that how accurately the accessTime is captured depends 
> on the following config, which has a default value of 1 hour, meaning a new 
> access within an hour of the previous access will not be captured.
> {code}
> public static final String  DFS_NAMENODE_ACCESSTIME_PRECISION_KEY =
>     HdfsClientConfigKeys.DeprecatedKeys.DFS_NAMENODE_ACCESSTIME_PRECISION_KEY;
> public static final long    DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT = 3600000;
> {code}
> .
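
As a rough illustration of the precision behaviour described in the quoted text, the check might look like the sketch below. The class and method names are hypothetical, and the rule that a precision of 0 disables access time updates is an assumption here rather than something stated in this issue. Only the 3600000 ms default comes from the quoted config.

```java
/** Hypothetical sketch of an access time precision check; not the HDFS implementation. */
class AccessTimePrecisionSketch {
    /** Default taken from the quoted config: 3600000 ms, i.e. 1 hour. */
    static final long PRECISION_DEFAULT_MS = 3_600_000L;

    /**
     * Returns true only if the new access is more than precisionMs later
     * than the last recorded access time. A precision of 0 is treated as
     * "access time tracking disabled" (assumption).
     */
    static boolean shouldUpdateAccessTime(long nowMs, long lastAccessMs, long precisionMs) {
        return precisionMs > 0 && nowMs > lastAccessMs + precisionMs;
    }
}
```

Under these assumptions, an access 30 minutes after the previous one leaves the stored accessTime unchanged, so it cannot produce an accessTime-only difference between snapshots.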



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
