Ray Mattingly created HBASE-28696:
-------------------------------------
Summary: BackupSystemTable can create huge delete batches that
should be partitioned instead
Key: HBASE-28696
URL: https://issues.apache.org/jira/browse/HBASE-28696
Project: HBase
Issue Type: Bug
Reporter: Ray Mattingly
When an incremental backup completes successfully, one of our final steps is to
delete the bulk load metadata from the system table for every bulk load that
needed to be captured in that backup. In effect, we truncate the entire bulk
loads system table with a single batch of deletes after each successful
incremental backup. This logic lives in
{{BackupSystemTable#deleteBulkLoadedRows}}:
{code:java}
  /*
   * Removes rows recording bulk loaded hfiles from backup table
   * @param rows the rows to be deleted
   */
  public void deleteBulkLoadedRows(List<byte[]> rows) throws IOException {
    try (Table table = connection.getTable(bulkLoadTableName)) {
      List<Delete> lstDels = new ArrayList<>();
      for (byte[] row : rows) {
        Delete del = new Delete(row);
        lstDels.add(del);
        LOG.debug("orig deleting the row: " + Bytes.toString(row));
      }
      table.delete(lstDels);
      LOG.debug("deleted " + rows.size() + " original bulkload rows");
    }
  }
{code}
Depending on usage, a cluster may run a huge number of bulk loads between
backups, so this design is needlessly fragile. We should partition these
deletes into reasonably sized batches so that this cleanup step never
erroneously fails a backup.
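
For illustration only, here is a minimal sketch of what partitioned deletes could look like. The batch size of 1000 is an arbitrary placeholder, not something taken from the codebase, and the surrounding names ({{connection}}, {{bulkLoadTableName}}, {{LOG}}) are simply reused from the existing method:
{code:java}
  public void deleteBulkLoadedRows(List<byte[]> rows) throws IOException {
    // Hypothetical batch size; the real value (and whether it should be
    // configurable) is an open question for the patch.
    final int deleteBatchSize = 1000;
    try (Table table = connection.getTable(bulkLoadTableName)) {
      List<Delete> batch = new ArrayList<>();
      for (byte[] row : rows) {
        batch.add(new Delete(row));
        if (batch.size() >= deleteBatchSize) {
          // Flush this chunk so no single client-side batch carries every
          // bulk load row accumulated since the last backup.
          table.delete(batch);
          batch.clear();
        }
      }
      if (!batch.isEmpty()) {
        table.delete(batch);
      }
      LOG.debug("deleted " + rows.size() + " original bulkload rows in batches of " + deleteBatchSize);
    }
  }
{code}
Fixed-size chunking like this keeps each batch bounded no matter how many bulk loads happened between backups; per-batch retry and failure handling are deliberately left out of the sketch.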