[
https://issues.apache.org/jira/browse/HBASE-29905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HBASE-29905:
-----------------------------------
Labels: pull-request-available (was: )
> BackupLogCleaner retains old WAL files due to stale entries in backup:system
> table
> ----------------------------------------------------------------------------------
>
> Key: HBASE-29905
> URL: https://issues.apache.org/jira/browse/HBASE-29905
> Project: HBase
> Issue Type: Bug
> Components: backup&restore
> Reporter: Jan Van Besien
> Priority: Major
> Labels: pull-request-available
>
> The backup:system table stores trslm: (table-region-server-log-map) rows with
> the row key format: {{trslm:\0}}
> Each row's value is a protobuf-serialized map of {{\{RegionServer → WAL
> timestamp}}}, representing the WAL position up to which each RegionServer has
> been backed up for that table.
> BackupLogCleaner uses this information to decide which WAL files to clean up,
> as follows:
> * During backup completion (FullTableBackupClient.java:192 /
> IncrementalTableBackupClient.java:330), writeRegionServerLogTimestamp()
> writes a trslm: row for each table in the backup, recording the latest WAL
> timestamp per RS.
> * Immediately after, readLogTimestampMap() (BackupSystemTable.java:802)
> scans all trslm: rows for that backup root — every table that has ever been
> backed up to that root, not just the tables in the current backup. This full
> map is stored into the BackupInfo object
> (backupInfo.setTableSetTimestampMap(...)) and persisted as part of the
> session: row in backup:system.
> * BackupLogCleaner (BackupLogCleaner.java:89-142) reads the most recent
> BackupInfo per backup root and iterates over its tableSetTimestampMap. For
> each RegionServer found across all tables, it computes the minimum timestamp
> as the "preservation boundary" for that server. WALs older than or equal to
> this boundary can be deleted; newer ones are retained. A single stale table
> with a year-old timestamp for any RS will pin WAL retention for that RS all
> the way back, preventing WAL cleanup.
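
The boundary computation described in the last bullet can be illustrated with a minimal Java sketch. It uses plain collections instead of HBase types, and the class, method, table and server names (WalBoundarySketch, preservationBoundaries, canDelete, "active_table", "stale_table", "rs1") are hypothetical and not taken from BackupLogCleaner. Retaining WALs for servers that have no boundary is an assumption of the sketch, not a statement about the real cleaner.

```java
import java.util.HashMap;
import java.util.Map;

public class WalBoundarySketch {

  // For each RegionServer, keep the minimum timestamp found across all tables.
  static Map<String, Long> preservationBoundaries(
      Map<String, Map<String, Long>> tableSetTimestampMap) {
    Map<String, Long> boundaries = new HashMap<>();
    for (Map<String, Long> rsTimestamps : tableSetTimestampMap.values()) {
      for (Map.Entry<String, Long> entry : rsTimestamps.entrySet()) {
        boundaries.merge(entry.getKey(), entry.getValue(), Long::min);
      }
    }
    return boundaries;
  }

  // A WAL is deletable only if its timestamp is at or below the boundary.
  // Assumption of this sketch: WALs of servers without a boundary are retained.
  static boolean canDelete(Map<String, Long> boundaries, String server, long walTimestamp) {
    Long boundary = boundaries.get(server);
    return boundary != null && walTimestamp <= boundary;
  }

  public static void main(String[] args) {
    Map<String, Long> active = new HashMap<>();
    active.put("rs1", 2000L);   // table still being backed up
    Map<String, Long> stale = new HashMap<>();
    stale.put("rs1", 1000L);    // table no longer backed up, row never refreshed

    Map<String, Map<String, Long>> tableSetTimestampMap = new HashMap<>();
    tableSetTimestampMap.put("active_table", active);
    tableSetTimestampMap.put("stale_table", stale);

    Map<String, Long> boundaries = preservationBoundaries(tableSetTimestampMap);
    System.out.println(boundaries);
    System.out.println(canDelete(boundaries, "rs1", 1500L));
  }
}
```

In this example the stale table's older timestamp for rs1 becomes the boundary, so a WAL stamped 1500 is retained even though the active table has been backed up to 2000.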
> The root cause is that there is no code anywhere that deletes trslm: rows.
> They are only written (overwritten) when a backup runs for that specific
> table. Two scenarios create stale rows:
> * (a) Table removed from backup (because the table is no longer included in
> backups or simply because the table was deleted).
> * (b) RegionServer decommissioned.
> Scenario (a) was observed in production (the workaround was to remove the
> stale entries manually).
> To fix this, I think we need a cleanup mechanism. Perhaps we can filter the
> readLogTimestampMap() results to only include tables in the current backup
> info, and delete everything else (or only filter, without deleting, but then
> the stale entries still remain in the table).
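
The filtering idea could look roughly like the following sketch. It is an illustration under stated assumptions, not the change in the linked pull request: it operates on plain maps keyed by table name, and TimestampMapFilterSketch, filterToCurrentTables and staleTables are hypothetical names. The real readLogTimestampMap() works with HBase types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TimestampMapFilterSketch {

  // Keep only the entries whose table is part of the current backup.
  static Map<String, Map<String, Long>> filterToCurrentTables(
      Map<String, Map<String, Long>> allTables, Set<String> currentTables) {
    Map<String, Map<String, Long>> filtered = new HashMap<>();
    for (Map.Entry<String, Map<String, Long>> entry : allTables.entrySet()) {
      if (currentTables.contains(entry.getKey())) {
        // Copy the inner map so the result does not alias the input.
        filtered.put(entry.getKey(), new HashMap<>(entry.getValue()));
      }
    }
    return filtered;
  }

  // Tables that have a stored row but are not in the current backup;
  // these are the candidates for deletion in the "filter and delete" variant.
  static Set<String> staleTables(
      Map<String, Map<String, Long>> allTables, Set<String> currentTables) {
    Set<String> stale = new TreeSet<>(allTables.keySet());
    stale.removeAll(currentTables);
    return stale;
  }
}
```

The sketch only covers filtering by table, as in scenario (a). It does not attempt to handle entries for decommissioned RegionServers, as in scenario (b).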