[ 
https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771727#comment-15771727
 ] 

Jianwei Cui commented on HBASE-17330:
-------------------------------------

Thanks for pointing out the mod time problem, [~stack]. I tried the patch 
locally as follows:
1. start a client that takes snapshots periodically;
2. make {{SnapshotFileCache#refreshCache}} log the names of the hfiles it 
loads each time it is scheduled.
The log shows that {{SnapshotFileCache}} can load the hfiles referenced by 
snapshots taken before {{refreshCache}} starts. However, as you mentioned, 
relying on the mod time is risky: its accuracy depends on the implementation 
of the underlying file system, and the mod time can also be updated 
explicitly (for example by {{FSNamesystem#setTimes}}). To be safer, we could 
make {{SnapshotFileCache#getUnreferencedFiles}} load hfile names from the 
on-disk snapshots whenever the passed file is not in the in-memory cache, as:
{code}
  public synchronized Iterable<FileStatus> getUnreferencedFiles(Iterable<FileStatus> files,
      final SnapshotManager snapshotManager)
      throws IOException {
    ...
    for (FileStatus file : files) {
      String fileName = file.getPath().getName();
      if (!refreshed && !cache.contains(fileName)) {
        // ==> Always load hfile names from the on-disk snapshots (ignoring the mod time).
        refreshCache();
        refreshed = true;
      }
      if (cache.contains(fileName)) {
        continue;
      }
{code}
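
To make the idea easier to try outside HBase, here is a minimal, self-contained sketch of the same lazy-refresh pattern. It is not the real {{SnapshotFileCache}}: the class name {{LazyRefreshCacheSketch}}, the {{Supplier}} standing in for "read hfile names from on-disk snapshots", the use of plain {{String}} names instead of {{FileStatus}}, and the {{getRefreshCount}} helper are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-in for SnapshotFileCache; not the HBase class. */
public class LazyRefreshCacheSketch {
  // In-memory cache of hfile names referenced by snapshots.
  private final Set<String> cache = new HashSet<>();
  // Assumed loader that returns the hfile names referenced by on-disk snapshots.
  private final Supplier<Set<String>> onDiskLoader;
  // Counts refreshes so the behaviour can be observed.
  private int refreshCount = 0;

  public LazyRefreshCacheSketch(Supplier<Set<String>> onDiskLoader) {
    this.onDiskLoader = onDiskLoader;
  }

  // Reloads the cache unconditionally; the mod time is not consulted.
  private void refreshCache() {
    cache.clear();
    cache.addAll(onDiskLoader.get());
    refreshCount++;
  }

  public synchronized List<String> getUnreferencedFiles(Iterable<String> fileNames) {
    List<String> unreferenced = new ArrayList<>();
    boolean refreshed = false;
    for (String fileName : fileNames) {
      // Refresh at most once per call, and only on the first cache miss.
      if (!refreshed && !cache.contains(fileName)) {
        refreshCache();
        refreshed = true;
      }
      if (cache.contains(fileName)) {
        continue; // still referenced by a snapshot
      }
      unreferenced.add(fileName);
    }
    return unreferenced;
  }

  public int getRefreshCount() {
    return refreshCount;
  }
}
```

With this shape, a call whose files are all already in the cache costs no reload, while any miss triggers exactly one reload per call regardless of the directory's mod time.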

> SnapshotFileCache will always refresh the file cache
> ----------------------------------------------------
>
>                 Key: HBASE-17330
>                 URL: https://issues.apache.org/jira/browse/HBASE-17330
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 2.0.0, 1.3.1, 0.98.23
>            Reporter: Jianwei Cui
>            Assignee: Jianwei Cui
>            Priority: Minor
>             Fix For: 2.0.0, 1.4.0
>
>         Attachments: HBASE-17330-v1.patch, HBASE-17330-v2.patch
>
>
> In {{SnapshotFileCache#refreshCache}}, {{hasChanges}} is computed as:
> {code}
>     try {
>       FileStatus dirStatus = fs.getFileStatus(snapshotDir);
>       lastTimestamp = dirStatus.getModificationTime();
>       // >= will make hasChanges always be true
>       hasChanges |= (lastTimestamp >= lastModifiedTime);
> {code}
> The {{(lastTimestamp >= lastModifiedTime)}} comparison makes {{hasChanges}} 
> always true, because {{lastModifiedTime}} is updated as:
> {code}
> this.lastModifiedTime = lastTimestamp;
> {code}
> So, SnapshotFileCache will always refresh the file cache.
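
The effect of the inclusive comparison described in the quoted report can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: {{ModTimeCheckSketch}}, its {{strict}} flag and the {{check}} method are hypothetical names, and the directory mod time is passed in as a plain {{long}} instead of being read from a file system. The strict variant is included for contrast and is not necessarily what the attached patches do.

```java
/** Hypothetical model of the mod time check; not the HBase class. */
public class ModTimeCheckSketch {
  // Mod time remembered from the previous check.
  private long lastModifiedTime = 0L;
  // When true, use '>' instead of the '>=' shown in the quoted snippet.
  private final boolean strict;

  public ModTimeCheckSketch(boolean strict) {
    this.strict = strict;
  }

  /** Returns whether the directory is considered changed, then stores the mod time. */
  public boolean check(long dirModTime) {
    boolean hasChanges = strict
        ? dirModTime > lastModifiedTime
        : dirModTime >= lastModifiedTime;
    // Same update as in the report: remember the value just observed.
    lastModifiedTime = dirModTime;
    return hasChanges;
  }
}
```

With {{strict}} set to false, repeated checks against an unchanged mod time keep reporting a change, which matches the constant refresh described above. With {{strict}} set to true, the second check of an unchanged mod time reports no change, but a modification that lands within the same timestamp tick would be missed, which is the mod time risk raised in the comment.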



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
