thomassarens created HBASE-28502:
------------------------------------

             Summary: Backup manifest of full backup contains incomplete table 
list
                 Key: HBASE-28502
                 URL: https://issues.apache.org/jira/browse/HBASE-28502
             Project: HBase
          Issue Type: Bug
          Components: backup&restore
    Affects Versions: 2.6.0, 4.0.0-alpha-1
            Reporter: thomassarens
         Attachments: TestBackupManifest.java

Noticed that the {{BackupManifest#getTableNames}} is returning only a single 
table instead of the complete list of tables that were requested for the backup 
in case of a full backup, in case of an incremental backup the manifest table 
list seem complete.

Checking the {{TableBackupClient#addManifest}} method shows why:
 * While looping over the included tables the manifest is stored per table and 
the comment mentions something about storing the manifest with the table 
directory:

{code:java}
// Since we have each table's backup in its own directory structure,
// we'll store its manifest with the table directory.
for (TableName table : backupInfo.getTables()) {
  manifest = new BackupManifest(backupInfo, table);
  ArrayList<BackupImage> ancestors = backupManager.getAncestors(backupInfo, 
table);
  for (BackupImage image : ancestors) {
    manifest.addDependentImage(image);
  }

  if (type == BackupType.INCREMENTAL) {
    // We'll store the log timestamps for this table only in its manifest.
    Map<TableName, Map<String, Long>> tableTimestampMap = new HashMap<>();
    tableTimestampMap.put(table, backupInfo.getIncrTimestampMap().get(table));
    manifest.setIncrTimestampMap(tableTimestampMap);
    ArrayList<BackupImage> ancestorss = backupManager.getAncestors(backupInfo);
    for (BackupImage image : ancestorss) {
      manifest.addDependentImage(image);
    }
  }
  manifest.store(conf);
}{code}
 
 * but the manifest path is based on the backup root dir and backup id, so it 
is not on a table directory level: 
{code:java}
Path manifestFilePath = new 
Path(HBackupFileSystem.getBackupPath(backupImage.getRootDir(), 
backupImage.getBackupId()), MANIFEST_FILE_NAME);{code}

 * so each call to {{manifest.store(conf)}} is just overwriting the same 
manifest file
 * for incremental backups the "complete" manifest is stored as well with 
{{manifest.store(conf)}} and using the exact same path, so that explains why it 
is correct for incremental backups:

{code:java}
// For incremental backup, we store a overall manifest in
// <backup-root-dir>/WALs/<backup-id>
// This is used when created the next incremental backup
if (type == BackupType.INCREMENTAL) {
manifest = new BackupManifest(backupInfo);
// set the table region server start and end timestamps for incremental backup
manifest.setIncrTimestampMap(backupInfo.getIncrTimestampMap());
ArrayList<BackupImage> ancestors = backupManager.getAncestors(backupInfo);
for (BackupImage image : ancestors) {
manifest.addDependentImage(image);
}
manifest.store(conf);
}{code}

 * the comment related to the manifest path being 
{{<backup-root-dir>/WALs/<backup-id>}} is incorrect

 
I created a simple test that verifies this issue [^TestBackupManifest.java] , 
but no idea how the fix this. Perhaps only the overall manifest file should be 
stored on {{<backup-root-dir>/<backup-id> }}level, but that goes against the 
comments here so not sure.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to