[ https://issues.apache.org/jira/browse/HBASE-7360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593432#comment-13593432 ]
Ted Yu commented on HBASE-7360: ------------------------------- bq. since taking or restoring a snapshot is something rare Some users would perform snapshot periodically. bq. the impact is the same as the impact of the cleaner after a table delete. Table deletion would ultimately remove data. Sanpshotting, on the other hand, creates copy (mostly through references) of data. Can you elaborate on why the impact is same ? bq. you're just asking stuff to the NN with max N RS connection Assuming N is the number of region servers, I think the above holds for any moment in time. bq. so the time required to do the snapshot is the same as the time required to do a roundtrip to the NN I tend to disagree. See code from ReferenceRegionHFilesTask: {code} // snapshot directories to store the hfile reference List<Path> snapshotFamilyDirs = TakeSnapshotUtils.getFamilySnapshotDirectories(snapshot, snapshotDir, families); LOG.debug("Add hfile references to snapshot directories:" + snapshotFamilyDirs); for (int i = 0; i < families.length; i++) { FileStatus family = families[i]; Path familyDir = family.getPath(); // get all the hfiles in the family FileStatus[] hfiles = FSUtils.listStatus(fs, familyDir, fileFilter); // if no hfiles, then we are done with this family if (hfiles == null || hfiles.length == 0) { LOG.debug("Not hfiles found for family: " + familyDir + ", skipping."); continue; } // make the snapshot's family directory Path snapshotFamilyDir = snapshotFamilyDirs.get(i); fs.mkdirs(snapshotFamilyDir); // create a reference for each hfile for (FileStatus hfile : hfiles) { // references are 0-length files, relying on file name. Path referenceFile = new Path(snapshotFamilyDir, hfile.getPath().getName()); LOG.debug("Creating reference for:" + hfile.getPath() + " at " + referenceFile); if (!fs.createNewFile(referenceFile)) { throw new IOException("Failed to create reference file:" + referenceFile); } } } {code} bq. do we want a format 1 now and a format 2 later on or just a format 2 later on? In the second half of the question, did you mean 'a format 2 now' ? I think this discussion is about reducing maintenance overhead, from both operations and code development points of view. I would vote for reaching format 2 in the first 0.94 release with snapshot feature. > Snapshot 0.94 Backport > ----------------------- > > Key: HBASE-7360 > URL: https://issues.apache.org/jira/browse/HBASE-7360 > Project: HBase > Issue Type: New Feature > Components: snapshots > Affects Versions: 0.94.3 > Reporter: Matteo Bertozzi > Assignee: Matteo Bertozzi > Fix For: 0.94.6 > > Attachments: 7360-v1.patch, HBASE-7299-0.94.patch, > HBASE-7360-jiras.txt, HBASE-7360-v0.patch > > > Backport snapshot code to 0.94 > The main changes needed to backport the snapshot are related to the protobuf > vs writable rpc. > Offline Snapshot > * HBASE-6610 HFileLink: Hardlink alternative for snapshot restore > * HBASE-6765 Take a Snapshot interface > ** This patch is significantly different because of the RPC changes between > 0.94/0.96 > * HBASE-6571 Generic multi-thread/cross-process error handling framework > * HBASE-6353 Snapshots shell > * HBASE-7107 Snapshot References Utils (FileSystem Visitor) > * HBASE-6836 Offline snapshots > * HBASE-6865 Snapshot File Cleaners > * HBASE-6777 Snapshot Restore Interface > ** This patch is significantly different because of the RPC changes between > 0.94/0.96 > * HBASE-6230 Clone/Restore Snapshots > * HBASE-6802 Export Snapshot > * HBASE-7240 Cleanup old snapshots on start > * HBASE-7174 Refactor snapshot file cleaner cache to use Snapshot > * HBASE-7367 Snapshot coprocessor and ACL security > * HBASE-7311 Add snapshot information to hbase master webui > ** This patch is significantly different because of the RPC changes between > 0.94/0.96 > * HBASE-7418 HFileLink flaky test > * HBASE-7420 TestSnapshotExceptionSnare and TestWALReferenceTask missing > test category annotation and failing TestCheckTestClasses > * HBASE-7206 ForeignException framework v2 (simplifies and replaces > HBASE-6571) > * HBASE-7430 TestSnapshotDescriptionUtils break compaction/scanner tests > (EnvironmentEdge issue) > * HBASE-7436 Improve stack trace info dumped by > ForeignExceptionSnare#rethrowException > * HBASE-7339 Splitting an HFileLink causes region servers to go down. > * HBASE-7439 HFileLink should not use the configuration from the Filesystem > * HBASE-7452 Change ForeignExceptionListener#receive(String, FE) to only be > #receive(FE) > * HBASE-7208 Transition Offline Snapshots to ForeignExceptions and Refactor > for merge > * HBASE-7207 Consolidate snapshot related classes into fewer packages > * HBASE-7400 ExportSnapshot mapper closes the FileSystem > * HBASE-7352 clone operation from HBaseAdmin can hang forever > * HBASE-7294 Check for snapshot file cleaners on start > * HBASE-7354 Snapshot Info/Debug Tool > * HBASE-7423 HFileArchiver should not use the configuration from the > Filesystem > * HBASE-7453 HBASE-7423 snapshot followup > * HBASE-7212 Globally Barriered Procedure Mechanism > * HBASE-6864 Online snapshots scaffolding > * HBASE-7321 Simple Flush Snapshot > * HBASE-7496 TestZKProcedure fails interrmittently. > * HBASE-7484 Fix Restore with schema changes > * HBASE-7419 revisit hfilelink file name format > * HBASE-7467 CleanerChore checkAndDeleteDirectory not deleting empty > directories > * HBASE-7214 CleanerChore logs too much, so much so it obscures all else > that is going on > * HBASE-7523 Snapshot attempt with the name of a previously taken snapshot > fails sometimes. > * HBASE-7480 Explicit message for not allowed snapshot on meta tables > * HBASE-7537 .regioninfo not created by createHRegion() > * HBASE-7535 Fix restore reference files > * HBASE-7552 Pass bufferSize param to FileLinkInputStream constructor within > FileLink.open method, and remove unnecessary import packages. > * HBASE-7501 Introduce MetaEditor method that both adds and deletes rows in > .META. table > * HBASE-7365/HBASE-7389 Safer table creation and deletion using .tmp dir > * HBASE-7547 Fix findbugs warnings in snapshot classes > * HBASE-7548 Fix javadoc warnings in snapshot classes > * HBASE-7538 Improve snapshot related error and exception messages > * HBASE-7536 Add test that confirms that multiple concurrent snapshot > requests are rejected > * HBASE-7583 Fixes and cleanups > * HBASE-7604 Remove duplicated code from HFileLink > * HBASE-7616 NPE in ZKProcedure.nodeCreated > * HBASE-7625 Remove duplicated logFSTree() from > TestRestoreFlushSnapshotFromClient > * HBASE-7627 UnsupportedOperationException in CatalogJanitor thread > * HBASE-7622 Add table descriptor verification after snapshot restore > * HBASE-7651 RegionServerSnapshotManager fails with CancellationException if > previous snapshot fails in per region task > * HBASE-7666 More logging improvements in online snapshots code > * HBASE-7689 CloneTableHandler notify completion too early > * HBASE-7699 Replace LOG.info() with LOG.debug() in FSUtils.listStatus() > * HBASE-7703 Eventually all online snapshots failing due to Timeout at same > regionserver. > * HBASE-7720 Improve logging messages of failed snapshot attempts. > * HBASE-7795 Race in the Restore Archiving > * HBASE-7788 Fix flakey TestRestore*SnapshotFromClient#testCloneSnapshot > * HBASE-7858 cleanup before merging snapshots branch to trunk > * HBASE-7889 Fix javadoc warnings in snapshot classes > * HBASE-7911 Remove duplicated code from CreateTableHandler > * HBASE-7969 Rename HBaseAdmin#getCompletedSnapshots as > HBaseAdmin#listSnapshots > * HBASE-7299 TestMultiParallel fails intermittently in trunk builds -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira