guangxuCheng commented on a change in pull request #769: HBASE-23202 ExportSnapshot (import) will fail if copying files to root directory takes longer than cleaner TTL
URL: https://github.com/apache/hbase/pull/769#discussion_r340045617
##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotFileCache.java
##########
@@ -251,6 +261,31 @@ private void refreshCache() throws IOException {
     this.snapshots.putAll(newSnapshots);
   }

+  @VisibleForTesting
+  List<String> getSnapshotsInProgress() throws IOException {
+    List<String> snapshotInProgress = Lists.newArrayList();
+    // only add those files to the cache, but not to the known snapshots
+    Path snapshotTmpDir = new Path(snapshotDir, SnapshotDescriptionUtils.SNAPSHOT_TMP_DIR_NAME);
+    FileStatus[] running = FSUtils.listStatus(fs, snapshotTmpDir);
+    if (running != null) {
+      for (FileStatus run : running) {
+        try {
+          snapshotInProgress.addAll(fileInspector.filesUnderSnapshot(run.getPath()));
+        } catch (CorruptedSnapshotException e) {
+          // See HBASE-16464
+          if (e.getCause() instanceof FileNotFoundException) {
+            // If the snapshot is corrupt, we will delete it
+            fs.delete(run.getPath(), true);
+            LOG.warn("delete the " + run.getPath() + " due to exception:", e.getCause());

Review comment:
Hmmm, there may be a race condition between ExportSnapshot and SnapshotCleaner. Copying the snapshot manifest is a fast operation. Maybe we can add a time threshold: when we catch CorruptedSnapshotException, we delete the snapshot folder only if it was last modified longer ago than that threshold; otherwise we skip the cleanup. WDYT?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
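
The time-threshold idea in the review comment could look roughly like the sketch below. It is only an illustration, not code from the PR: `StaleSnapshotCheck`, `shouldDelete`, and the five-minute threshold are hypothetical names and values. In the real method, the modification time would presumably come from `run.getModificationTime()` on the `FileStatus`, and the result would gate the `fs.delete(...)` call.

```java
import java.time.Duration;

// Hypothetical helper, not part of HBase: decides whether an in-progress
// snapshot folder is old enough to be treated as abandoned rather than
// still being written by ExportSnapshot.
final class StaleSnapshotCheck {
  private StaleSnapshotCheck() {}

  /**
   * @param modificationTimeMs last modification time of the snapshot folder (epoch millis)
   * @param nowMs              current time (epoch millis)
   * @param threshold          minimum age before the folder may be deleted (assumed setting)
   * @return true only if the folder is strictly older than the threshold
   */
  static boolean shouldDelete(long modificationTimeMs, long nowMs, Duration threshold) {
    // A negative age (modification time in the future, e.g. clock skew)
    // never exceeds the threshold, so such folders are kept.
    long ageMs = nowMs - modificationTimeMs;
    return ageMs > threshold.toMillis();
  }

  public static void main(String[] args) {
    Duration threshold = Duration.ofMinutes(5); // assumed value, for illustration only
    long now = System.currentTimeMillis();
    // Folder touched 10 seconds ago: likely still being copied, keep it.
    System.out.println(shouldDelete(now - Duration.ofSeconds(10).toMillis(), now, threshold));
    // Folder untouched for an hour: treat it as abandoned.
    System.out.println(shouldDelete(now - Duration.ofHours(1).toMillis(), now, threshold));
  }
}
```

With a check like this, a `CorruptedSnapshotException` on a recently modified folder would be ignored for now and re-evaluated on the next cleaner run, which narrows the race window without leaving corrupt folders around indefinitely.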