[ https://issues.apache.org/jira/browse/HBASE-29296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
guluo resolved HBASE-29296.
---------------------------
Fix Version/s: 2.7.0, 3.0.0-beta-2, 2.6.4, 2.5.13
Resolution: Fixed
> Missing critical snapshot expiration checks
> -------------------------------------------
>
> Key: HBASE-29296
> URL: https://issues.apache.org/jira/browse/HBASE-29296
> Project: HBase
> Issue Type: Bug
> Components: backup&restore, snapshots
> Affects Versions: 2.6.2
> Reporter: Dimas Shidqi Parikesit
> Priority: Critical
> Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.4, 2.5.13
>
>
> In HBase it is crucial to prevent expired snapshots from being returned to
> clients to ensure correctness. There have been existing efforts (e.g.,
> HBASE-27671 and HBASE-28704) to add snapshot expiration checks in different
> scenarios to avoid such issues. However, we found that this protection is
> not consistent. Specifically, several operations still miss such checks in
> the latest HBase version (5dafa9e). Their patterns are similar to the
> previous tickets mentioned above. In practice, we observed expired snapshots
> still being returned to clients successfully without generating any alarms.
> We have written test cases to prove these issues can be reproduced
> successfully (see attached). We also attach the manual steps in case anyone
> is interested.
> Your insights are very much appreciated. We will continue following up on
> this issue until it is resolved.
> Reproducing steps (3 scenarios in total)
> 1. Restore
> Restoring from a full backup succeeds even if the snapshot has expired.
> The snapshot can expire if `hbase.master.snapshot.ttl` was set when the
> backup was taken.
> Steps to reproduce this bug:
> A. Start an HBase cluster, and set `hbase.master.snapshot.ttl` config value
> B. Create a table
> C. Create a full backup using `hbase backup create full
> hdfs://host5:9000/data/backup -t tableName`
> D. Wait until the snapshot has expired
> E. Restore the table using `hbase restore hdfs://host5:9000/data/backup
> <backup_id>`
> F. Check that the table is restored successfully
> We propose to add a snapshot expiration check on
> RestoreTool.java:createAndRestoreTable to prevent this issue.
>
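For reference, the kind of guard proposed here can be sketched in a few lines of plain Java. This is only an illustration and not the actual patch: the class `SnapshotExpiry` and both method names are hypothetical, the TTL is assumed to be in seconds with 0 meaning "never expires", and timestamps are assumed to be epoch milliseconds.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of the HBase API. */
final class SnapshotExpiry {
  private SnapshotExpiry() {
  }

  /**
   * Returns true if the snapshot's TTL has elapsed.
   * Assumptions: ttlSeconds <= 0 means "never expires"; createdMillis and
   * nowMillis are non-negative epoch milliseconds.
   */
  static boolean isExpired(long ttlSeconds, long createdMillis, long nowMillis) {
    if (ttlSeconds <= 0 || createdMillis <= 0) {
      return false;
    }
    // toMillis saturates at Long.MAX_VALUE, so a huge TTL never reads as expired.
    long ttlMillis = TimeUnit.SECONDS.toMillis(ttlSeconds);
    return nowMillis - createdMillis > ttlMillis;
  }

  /** Fails fast instead of silently operating on an expired snapshot. */
  static void ensureNotExpired(String name, long ttlSeconds, long createdMillis,
      long nowMillis) {
    if (isExpired(ttlSeconds, createdMillis, nowMillis)) {
      throw new IllegalStateException(
        "Snapshot '" + name + "' has expired (TTL " + ttlSeconds + "s)");
    }
  }
}
```

A caller on the restore path would invoke something like `ensureNotExpired(...)` before touching any snapshot data, so that an expired snapshot surfaces as an error rather than a successful restore. The same guard would apply to scenario 2 below.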
> 2. Incremental backup
> Incremental backup is done based on a previous full backup. Incremental
> backup will succeed even if the full backup has expired.
> Steps to reproduce this bug:
> A. Start an HBase cluster, and set `hbase.master.snapshot.ttl` config value
> B. Create a table
> C. Create a full backup using `hbase backup create full
> hdfs://host5:9000/data/backup -t tableName`
> D. Wait until the snapshot has expired
> E. Create an incremental backup using `hbase backup create incremental
> hdfs://host5:9000/data/backup -t tableName`
> F. Check that the backup succeeds
> We propose to add a snapshot expiration check on
> IncrementalTableBackupClient.java:verifyCfCompatibility to prevent this issue.
>
> 3. Snapshot procedure
> We found that it is possible to create a snapshot with a TTL value so low
> that it will expire before the SnapshotProcedure has finished. The
> SnapshotProcedure will finish normally as if the snapshot is fine.
> Steps to reproduce this bug:
> A. Start an HBase cluster and create a table
> B. Create a snapshot using hbase shell with TTL=1
> `snapshot 'mytable', 'snapshot1234', {TTL => 1}`
> C. Check that the command finished without an error, and the snapshot has
> expired
> This behavior is only possible if the user accidentally sets the TTL too
> low or if the SnapshotProcedure is interrupted after the
> `SNAPSHOT_WRITE_SNAPSHOT_INFO` phase but before it’s fully finished.
> We propose to add an expiration check in the `SNAPSHOT_COMPLETE_SNAPSHOT`
> phase right before the snapshot is marked as completed to ensure that the
> snapshot hasn’t expired before the SnapshotProcedure is considered
> successfully finished.
> Granted, we’re not sure whether this one is a bug or intended behavior.
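
To make the third scenario concrete, here is a heavily simplified sketch of a completion-time check. It is not the real SnapshotProcedure: the class `SnapshotProcedureSketch`, the `DONE` state, and the injected clock are all hypothetical, and only the two phase names come from the description above. The TTL is assumed to be in seconds, with 0 meaning "never expires".

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical two-phase model of the procedure; for illustration only. */
final class SnapshotProcedureSketch {
  // DONE is an invented terminal state; the other two names are from the ticket.
  enum State { SNAPSHOT_WRITE_SNAPSHOT_INFO, SNAPSHOT_COMPLETE_SNAPSHOT, DONE }

  private final String name;
  private final long ttlSeconds;
  private final LongSupplier clock; // returns current time in milliseconds
  private long createdMillis;
  private State state = State.SNAPSHOT_WRITE_SNAPSHOT_INFO;

  SnapshotProcedureSketch(String name, long ttlSeconds, LongSupplier clock) {
    this.name = name;
    this.ttlSeconds = ttlSeconds;
    this.clock = clock;
  }

  State state() {
    return state;
  }

  /** Executes the current phase and advances to the next one. */
  void step() {
    switch (state) {
      case SNAPSHOT_WRITE_SNAPSHOT_INFO: {
        // The creation time is fixed here, so the TTL starts counting now.
        createdMillis = clock.getAsLong();
        state = State.SNAPSHOT_COMPLETE_SNAPSHOT;
        break;
      }
      case SNAPSHOT_COMPLETE_SNAPSHOT: {
        // Proposed guard: refuse to mark the snapshot complete if it has
        // already outlived its TTL while the procedure was running.
        long elapsed = clock.getAsLong() - createdMillis;
        if (ttlSeconds > 0 && elapsed > TimeUnit.SECONDS.toMillis(ttlSeconds)) {
          throw new IllegalStateException(
            "Snapshot '" + name + "' expired before completion");
        }
        state = State.DONE;
        break;
      }
      default:
        throw new IllegalStateException("Procedure already finished");
    }
  }
}
```

With a TTL of 1 second and more than a second elapsing between the two phases, the second `step()` throws instead of reporting success. Injecting the clock keeps the sketch testable without sleeping.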
--
This message was sent by Atlassian Jira
(v8.20.10#820010)