[ https://issues.apache.org/jira/browse/HBASE-29346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang resolved HBASE-29346.
-------------------------------
    Fix Version/s: 2.7.0
                   3.0.0-beta-2
                   2.6.3
                   2.5.12
     Hadoop Flags: Reviewed
       Resolution: Fixed

Pushed to all active branches.

Thanks [~prathyu6] for contributing!

> Multiple Snapshot restores on same restoreDir ends up in Dataloss
> -----------------------------------------------------------------
>
>                 Key: HBASE-29346
>                 URL: https://issues.apache.org/jira/browse/HBASE-29346
>             Project: HBase
>          Issue Type: Bug
>          Components: snapshots
>            Reporter: Prathyusha
>            Assignee: Prathyusha
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 2.7.0, 3.0.0-beta-2, 2.6.3, 2.5.12
>
>
> We restore snapshots to a temporary directory for Snapshot reads.
> When multiple SnapshotManifests (both created on the same table, at t1 and
> t2 with t2 > t1) are restored to the same temp dir, the restore deletes the
> merge parent regions from /hbase/data/ instead of from the temp restore
> folder, as part of the region restore in
> [RestoreSnapshotHelper|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/RestoreSnapshotHelper.java#L416]
> Reproduce steps
> # Create a Snapshot on a table
> # Restore that snapshot to a temporary restoreDirectory instead of to the
> same table
> # Delete that snapshot from the shell
> # Disable compactions and trigger a Merge
> # Create another snapshot
> # Restore that snapshot to the same restoreDirectory from Step-2
> # It archives the closed parent regions from /hbase/data/ of the actual
> table instead of from the temporary restoreDirectory, leaving dangling
> references in the daughter region, which ends up in data loss
> # Restart the regionserver holding the merged daughter region; it will end
> up in FAILED_OPEN state because of the dangling reference files, since the
> parent store files are already archived
> Proposed immediate fix -
> RestoreSnapshotHelper does {{restore, add, remove}} regions.
> Restore/Add operations use the {{tableDir}} of RestoreSnapshotHelper (which
> is constructed from {{restoreDir}}) to construct {{RegionDir}} paths.
> We should use the same strategy in the removeRegions path as well.
> Currently,
> [RestoreSnapshotHelper.removeHdfsRegion|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/RestoreSnapshotHelper.java#L416]
> uses
> [HFileArchiver.archiveRegion|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java#L104],
> which essentially constructs the table directory from rootDir instead of
> restoreDir.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
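
To illustrate the proposed fix described in the issue, here is a minimal Java sketch of how the base directory passed in decides which region directory ends up being touched. This is not the HBase code: {{RegionDirSketch}}, {{tableDir}}, {{regionDir}}, the sample paths, and the {{<base>/data/<namespace>/<table>/<region>}} layout are all assumptions made for illustration, and plain {{java.nio.file.Path}} stands in for the Hadoop path type.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RegionDirSketch {

    // Hypothetical helper: builds <base>/data/<namespace>/<table>.
    // The layout is an assumption for this sketch, not taken from the issue.
    static Path tableDir(Path baseDir, String namespace, String table) {
        return baseDir.resolve("data").resolve(namespace).resolve(table);
    }

    // Hypothetical helper: a region directory lives directly under its table directory.
    static Path regionDir(Path tableDir, String encodedRegionName) {
        return tableDir.resolve(encodedRegionName);
    }

    public static void main(String[] args) {
        Path rootDir = Paths.get("/hbase");          // live cluster data (sample value)
        Path restoreDir = Paths.get("/tmp/restore"); // temporary restore target (sample value)
        String region = "abc123";                    // made-up encoded region name

        // Remove path as described in the issue: table dir derived from rootDir,
        // so the region that gets archived belongs to the live table.
        Path fromRoot = regionDir(tableDir(rootDir, "default", "t1"), region);

        // Proposed behaviour: derive the table dir from restoreDir, the same way
        // the restore/add paths do, so only the temporary copy is affected.
        Path fromRestore = regionDir(tableDir(restoreDir, "default", "t1"), region);

        System.out.println("derived from rootDir:    " + fromRoot);
        System.out.println("derived from restoreDir: " + fromRestore);
    }
}
```

The only difference between the two calls is the base directory, which is why building the remove path from the same {{tableDir}} as restore/add keeps the operation inside the restore folder.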